Abstract

We introduce a variational method for approximating distribution functions of dynamics
with a "Liouville operator" L, in terms of a nonequilibrium action functional for two independent
(left and right) trial states. The method is valid for deterministic or stochastic Markov
dynamics, and for stationary or time-dependent distributions. A practical Rayleigh-Ritz
procedure is advanced, whose inputs are finitely parametrized ansatz for the trial states,
leading to a "parametric action" for their evolution. The Euler-Lagrange equations of the
action principle are Hamiltonian in form (generally noncanonical). This permits a simple
identification of fixed points as critical points of the parametric Hamiltonian.

We also establish a variational principle for low-order statistics, such as mean values 
and correlation functions, by means of least effective action. The latter is a functional 
of the given variable, which is positive and convex as a consequence of Hölder realizability
inequalities. Its value measures the "cost" for a fluctuation from the average to occur and in 
a weak-noise limit it reduces to the Onsager-Machlup action. In general, the effective action 
is shown to arise from the nonequilibrium action functional by a constrained variation. This 
result provides a Rayleigh-Ritz scheme for calculating just the desired low-order statistics, 
with internal consistency checks less demanding than for the full distribution. 



1 Introduction 



The Rayleigh-Ritz variational method is a well-established technique in quantum mechanics
(e.g. see [1]). In this method one approximately solves the stationary Schrödinger equation
by making a physically motivated trial ansatz for the ground-state wavefunction and then
varying the energy-expectation functional with respect to its parameters. A similar method is
available for solving the time-dependent Schrödinger equation, based upon the Dirac-Frenkel
dynamic variational principle [2], [3], [4], [5]. These methods are among the very few tools in
the arsenal of theoretical physics able to attack strong-coupling problems of quantum dynamics
systematically. They are especially useful in quantum field theory and many-body theory,
where alternative numerical approaches are expensive or infeasible. In some cases, such as the
BCS theory of superconductivity, the variational principle has been the stepping-stone to an
exact solution of the problem.

In our opinion, nonequilibrium statistical mechanics has been lacking a variational principle 
of the same flexibility and scope as in quantum theory, capable of determining the probability 
density function (PDF) for both the steady-state and also the time-dependent solution to the 
initial-value problem. This is particularly true for problems such as high Reynolds number 
turbulence and large-scale dynamics of multiphase fluids, where there is no small parameter in 
which to make a perturbation expansion or asymptotic development and strong fluctuations 
dominate the phenomena on a wide range of length-scales. An obvious analogy exists between
the Schrödinger equation for the wave-function and the Liouville equation for the PDF in the
nonequilibrium problems:

$$\partial_t P = L P. \tag{1.1}$$

This analogy has been used before to express classical statistical dynamics as a formal quantum
field theory in the work of Martin-Siggia-Rose (MSR) [6]. It was noted in [6] that variational
principles could be formulated, but no further details were given. However, a mathematical obstacle
exists to applying the quantum principles by analogy, because the formal "Hamiltonian" L is
generally non-Hermitian for the dissipative dynamical systems of interest. Variational methods
of the standard form as in quantum mechanics have been employed in special cases where L can
be transformed to Hermitian form [7], [8], [9] or else based upon the Hermitian squared operator
$L^\dagger L$ [9], [10]. These methods seem to be either too restrictive or too cumbersome to be as useful
as the corresponding quantum principles. Recently, we have observed in the turbulence context
that a variational method may be developed for nonequilibrium dynamics which preserves the
principal advantages of the quantum method [11]. The key idea in the new formulation is to
vary jointly over independent left and right trial states. Although this Rayleigh-Ritz method
seems to be most natural for a non-Hermitian operator, it does not seem to have been previously
used for nonequilibrium dynamics. It is our purpose here to develop this method in a general
context and in some formal detail.

One advantage of the variational method in our formulation is that it yields, by a procedure 
of constrained variation, a characterization of the effective action for any selected statistic of 
interest, such as a mean-value or a two-point correlation. The effective action is a non-negative, 
convex functional whose minimum is achieved by the true ensemble-average value. In quantum 
field theory the concept has its roots in the early work of Heisenberg & Euler [12] and Schwinger
[13] in QED. In nonequilibrium statistical mechanics, the first such action principle seems to
have been Onsager's 1931 "principle of least dissipation" [14], which applies to systems subject
to thermal or molecular noise, governed by a fluctuation-dissipation relation. A formulation
of the least-dissipation principle by an action functional on histories was developed in 1953 by
Onsager and Machlup [15]. The effective action we consider coincides in a weak-noise limit with
the Onsager-Machlup action, as discussed some time ago by Graham [16]. For vanishing noise, a
path-integral formula for the effective action can be evaluated by steepest descent, yielding the
"classical" action of Onsager-Machlup. However, in the strong-noise case, efficient calculational
tools remain to be developed. We show here that the Rayleigh-Ritz method provides one such
computational scheme. The basis of this method is a generalization of Symanzik's theorem in
Euclidean field theory [17] (see also [18]), which characterizes the static effective action, or
"effective potential," by a constrained variation of the quantum energy-expectation functional.
This theorem has been extended by us to MSR field theory with non-Hermitian Hamiltonian
operator. Here we shall, for completeness, briefly recapitulate that result and then expound
in detail the corresponding Rayleigh-Ritz method. We also establish a Symanzik-type theorem
for the time-dependent effective action, extending the earlier result of Jackiw & Kerman in
quantum theory to the initial-value problem in nonequilibrium statistical dynamics.

The methods we develop here are quite general and apply, indeed, to the solution of any 
large-scale stochastic system, not only those in nonequilibrium statistical physics, but also to 
population dynamics in biology, to stochastic market models in mathematical finance, etc. The 
advantages of a variational scheme are well-known. For example, we quote: 

"The great virtue of the variational treatment, 'Ritz's method', is that it permits 
efficient use in the process of calculation, of any experimental or intuitive insight 
which one may possess concerning the problem which is to be solved by calculation. 
It is important to realize that this is not possible, or possible to a much smaller 
extent, if one performs the calculation by using the original form of the equations 
of motion... Ritz's method, on the other hand, is definitely a method of successive 
approximations, and one which converges better in the later stages of the approx- 
imation. Any information therefore which one may possess — no matter whether it 
comes from experiments, from intuition, or from general experience obtained in pre- 
vious works on similar problems — can be made useful by using it in formulating the 



point of departure, the 'zeroeth approximation'." (J. von Neumann, [20], p. 357). 



The present paper elaborates the theoretical foundation of such a variational scheme for stochastic
dynamical systems. In future work we shall apply the method to various concrete systems
of practical interest. In particular, the paper [21] demonstrates the feasibility of the Rayleigh-Ritz
method for numerical computation of the effective potential, and [22] applies the action
principle to the problem of moment closures in turbulence modelling.



2 The Variational Method for Distributions



Our problem is to calculate the probability distribution functions (PDFs), denoted by P, for
nonequilibrium Markov dynamics, governed by an equation of the form of Eq. (1.1), where L is
the (forward) Markov generator. Concrete examples of practical interest are the nonequilibrium
master equations [23] and, as a particular case, the Fokker-Planck equations, with



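$$L = -\frac{\partial}{\partial x_i}\big(K_i(x)\,\cdot\big) + \frac{1}{2}\frac{\partial^2}{\partial x_i\,\partial x_j}\big(D_{ij}(x)\,\cdot\big), \tag{2.1}$$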

in which K is the drift vector and D is the diffusion tensor. A degenerate case of the latter of
special interest occurs for zero noise (D = 0), which is

$$L = -\frac{\partial}{\partial x_i}\big(K_i(x)\,\cdot\big), \tag{2.2}$$

the "Liouville operator" of the deterministic dynamical system $\dot{x} = K(x)$.

We develop here a simple variational method to calculate approximately the solutions of
Eq. (1.1) for P, both for the stationary PDF, $P_s$, and for time-dependent solutions $P_t$ with
prescribed initial data $P_0$. Our methods are analogous to Rayleigh-Ritz procedures traditional
in quantum mechanics, but with a modification due to the fact that the operator L is non-self-adjoint:

$$L^\dagger \neq L. \tag{2.3}$$

Although the spectra of $L$ and $L^\dagger$ are the same (because L is a real operator: $L^* = L$), their
eigenstates are distinct. Equivalently, the left and right eigenstates of L are distinct [1], [24].
This is particularly true for the "ground states"

$$L|\Omega^R\rangle = L^\dagger|\Omega^L\rangle = 0. \tag{2.4}$$

Because of the fundamental asymmetry of the problem, Hilbert space or $L^2$ methods are not
so useful as in quantum theory. Instead, the standard mathematical formulation (see [25]) is
to take $L$ as an operator on $L^1$, considered as a space of "normalizable states," and $L^\dagger$ as an
operator on $L^\infty$, considered as a space of "bounded observables."$^1$ Although the inequality of
the two ground states is a complication, there are special features that largely compensate for
this. The "right ground state" $\Omega^R$ is the main unknown of the problem, the stationary PDF,
$P_s$, and it can always be taken to be non-negative

$$\Omega^R(x) \geq 0. \tag{2.5}$$

This is part of the statement of the Perron-Frobenius theorem, since $e^{tL}$ is an operator with
strictly positive kernel: see [26], or Theorem 3.3.2 of [27]. On the other hand, the "left ground
state" is known exactly a priori:

$$\Omega^L(x) \equiv 1. \tag{2.6}$$

This latter fact turns out to be of great utility in our method. We discuss first the stationary
problem and thereafter consider the time-dependent case.

(i) Stationary Distributions

Define a functional $\mathcal{H}$ of left and right state vectors, as

$$\mathcal{H}[\Psi^R,\Psi^L] = \langle\Psi^L, L\Psi^R\rangle. \tag{2.7}$$

Then it is easy to see that $\Omega^R, \Omega^L$ are uniquely characterized as the joint extremal point of the
functional $\mathcal{H}$:

$$\delta\mathcal{H}[\Psi^R,\Psi^L] = 0 \iff (\Psi^R,\Psi^L) = (\Omega^R,\Omega^L). \tag{2.8}$$

In fact,

$$\delta\mathcal{H}[\Psi^R,\Psi^L] = \langle\delta\Psi^L, L\Psi^R\rangle + \langle\Psi^L, L\,\delta\Psi^R\rangle = 0, \tag{2.9}$$

if and only if

$$L|\Psi^R\rangle = L^\dagger|\Psi^L\rangle = 0. \tag{2.10}$$

$^1$ The mathematical notation is, unfortunately, the opposite to that generally adopted in the physics literature:
what we have called $L$, $L^\dagger$ are in mathematics usually denoted as $L^*$, $L$ (forward and backward Markov operators,
respectively).

As stated above, we take $\Psi^R \in L^1$ ("states") and $\Psi^L \in L^\infty$ ("observables"), with

$$\langle\Psi^L,\Psi^R\rangle = \int dx\, \Psi^L(x)^*\,\Psi^R(x). \tag{2.11}$$

The "inner product" notation is always used in this paper as the canonical sesquilinear association
of $\Psi^R \in L^1$ and $\Psi^L \in L^\infty$ with the complex number $\langle\Psi^L,\Psi^R\rangle$ defined in Eq. (2.11).
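
The stationary principle can be checked directly in a finite-state setting, where L is a rate matrix and the pairing of Eq. (2.11) is an ordinary dot product. The following is only a minimal sketch under assumptions that are not in the text: the 3-state rate matrix is hypothetical, the column convention for L is our choice, and the function names are illustrative. It verifies that the stationary distribution together with the constant observable 1 makes both first variations of the functional of Eq. (2.7) vanish.

```python
# Hypothetical 3-state master equation dP/dt = L P.
# Convention (an assumption of this sketch): L[i][j] is the rate j -> i for i != j,
# so every column of L sums to zero and probability is conserved.
L = [[-3.0,  1.0,  2.0],
     [ 2.0, -4.0,  1.0],
     [ 1.0,  3.0, -3.0]]


def apply_forward(L, psi_r):
    """(L psi_r)_i = sum_j L[i][j] * psi_r[j]: the generator acting on a state."""
    n = len(L)
    return [sum(L[i][j] * psi_r[j] for j in range(n)) for i in range(n)]


def apply_adjoint(L, psi_l):
    """(L^T psi_l)_j = sum_i psi_l[i] * L[i][j]: the adjoint acting on an observable."""
    n = len(L)
    return [sum(psi_l[i] * L[i][j] for i in range(n)) for j in range(n)]


def hamiltonian(L, psi_r, psi_l):
    """Discrete analogue of H[psi_r, psi_l] = <psi_l, L psi_r> for real vectors."""
    return sum(a * b for a, b in zip(psi_l, apply_forward(L, psi_r)))


def relax_to_stationary(L, dt=0.1, steps=500):
    """Crude solver for L P = 0: iterate P <- P + dt * L P from the uniform state."""
    n = len(L)
    p = [1.0 / n] * n
    for _ in range(steps):
        p = [pi + dt * di for pi, di in zip(p, apply_forward(L, p))]
    return p


omega_r = relax_to_stationary(L)   # right ground state: the stationary distribution
omega_l = [1.0, 1.0, 1.0]          # left ground state: the constant observable

# First variations of H at (omega_r, omega_l); both should vanish.
grad_left = apply_forward(L, omega_r)    # dH/dpsi_l = L psi_r
grad_right = apply_adjoint(L, omega_l)   # dH/dpsi_r = L^T psi_l

print("stationary state:", omega_r)
print("H at the ground states:", hamiltonian(L, omega_r, omega_l))
```

For this particular matrix the stationary state is proportional to (9, 7, 10). The left ground state needs no computation at all, which is the discrete counterpart of Eq. (2.6).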



This simple variational characterization of the ground states can be made the basis of a
Rayleigh-Ritz method of approximation. To initiate this method, one must make trial ansatz

$$\Psi^R = \Psi^R(\alpha) \quad \& \quad \Psi^L = \Psi^L(\alpha), \tag{2.12}$$

for the ground states.$^2$ The vector $\alpha = (\alpha_1,\ldots,\alpha_N)$ denotes a set of $N$ real parameters (where
possibly $N = \infty$). In certain cases, we shall wish to have dependence of some parameters
only in one of the vectors $\Psi^H$, $H = L, R$, and we denote the corresponding parameters as
$\alpha^H = (\alpha_1^H,\ldots,\alpha_{N_H}^H)$, for $H = L, R$ respectively. We then use $\alpha$ to denote only the common
parameters in both trial vectors. An interesting special case is when there are no such common
parameters, i.e.

$$\Psi^R = \Psi^R(\alpha^R) \quad \& \quad \Psi^L = \Psi^L(\alpha^L), \tag{2.13}$$

and $N_L = N_R$, i.e. with equal numbers of the left- and right-parameters. The ansatz provide an
explicit, but arbitrary, reduction of the original variational problem in an infinite-dimensional
function space to an analogous problem in $N$-dimensional Euclidean space. A given assumed
form of the trial ansatz provides, in essence, a "nonlinear projection" of the original time-independent
stationarity equations. This is the same general strategy proposed explicitly by
Bayly under the term "parametric PDF closures" [28] (and used implicitly by others before).
Here we simply explain how this strategy may be implemented variationally.
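For any particular ansatz, we denote

$$\mathcal{H}(\alpha) = \langle\Psi^L(\alpha), L\Psi^R(\alpha)\rangle, \tag{2.14}$$

which we call the (parametric) Hamiltonian. We may now seek the extremal, or critical,
points of $\mathcal{H}$:

$$\frac{\partial\mathcal{H}}{\partial\alpha_i}(\alpha_*) = 0. \tag{2.15}$$

This condition may be written more explicitly as

$$\langle\psi_i^L(\alpha_*), L\Psi^R(\alpha_*)\rangle + \langle\Psi^L(\alpha_*), L\psi_i^R(\alpha_*)\rangle = 0 \tag{2.16}$$

for each $i = 1,\ldots,N$, where, in general, for $H = L, R$

$$\psi_i^H(\alpha) = \frac{\partial\Psi^H}{\partial\alpha_i}(\alpha). \tag{2.17}$$

One may take the corresponding state vectors as the approximations to the ground states:

$$\Omega_*^R(x) = \Psi^R(x;\alpha_*) \quad \& \quad \Omega_*^L(x) = \Psi^L(x;\alpha_*). \tag{2.18}$$

$^2$ Since we know $\Omega^L$ to be exactly equal to one, it may seem unnecessary to make an ansatz for it at all.
However, variation over the "observables" is required to characterize the "state," or right ground-state $\Omega^R$.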

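In the special case Eq. (2.13) with no common parameters, the variational equations become
simply

$$0 = \frac{\partial\mathcal{H}}{\partial\alpha_i^R}(\alpha_*) = \langle\Psi^L(\alpha_*^L), L\psi_i^R(\alpha_*^R)\rangle \tag{2.19}$$

and

$$0 = \frac{\partial\mathcal{H}}{\partial\alpha_i^L}(\alpha_*) = \langle\psi_i^L(\alpha_*^L), L\Psi^R(\alpha_*^R)\rangle, \tag{2.20}$$

with $i = 1,\ldots,N\ (= N_R = N_L)$. We may also write out the general Eq. (2.16) more explicitly as
separate equations for the variations under each of $\alpha^R$, $\alpha^L$, and $\alpha$. However, we have not found
this version of the equations to be as useful, so that we relegate it to an Appendix.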

In general, the function $\mathcal{H}(\alpha)$ may have more than one critical point. Some a priori criteria
for selection of the critical point(s) of interest arise from the exact information for the problem
that $\mathcal{H}[\Omega^R,\Omega^L] = 0$ and that $\Omega^L \equiv 1$. Hence, among the possible critical points, we should only
accept those for which

$$\mathcal{H}(\alpha_*) \approx 0, \tag{2.21}$$

and

$$\Psi^L(x;\alpha_*) \approx 1. \tag{2.22}$$

The second condition generally implies the first. Hence, we should only accept those critical
points for which $\Psi^L$ is close to the constant 1. We refer to such critical points as "acceptable."
Because of the acceptability condition, we see that the ansatz need really only explore the
region near $\Psi^L \approx 1$. Thus, we may without loss of generality assume that $\alpha_i^L \ll 1$ and expand
to linear order:

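$$\Psi^L(\alpha,\alpha^L) = 1 + \sum_{i=1}^{N_L}\alpha_i^L\,\psi_i^L(\alpha), \tag{2.23}$$

where, now, for $H = L, R$, $\psi_i^H(\alpha,\alpha^H) = \frac{\partial\Psi^H}{\partial\alpha_i^H}(\alpha,\alpha^H)$, rather than Eq. (2.17). Correspondingly,

$$\mathcal{H}(\alpha^R,\alpha^L,\alpha) = \alpha_i^L\,\langle\psi_i^L(\alpha), L\Psi^R(\alpha,\alpha^R)\rangle \tag{2.24}$$

(summation convention implied).$^3$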

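It is useful to consider the special case Eq. (2.13) with no common parameters, for which
the variational Eqs. (2.19), (2.20) become simply

$$\alpha_i^L\,\langle\psi_i^L, L\psi_j^R(\alpha^R)\rangle = 0 \tag{2.26}$$

and

$$\langle\psi_i^L, L\Psi^R(\alpha^R)\rangle = 0 \tag{2.27}$$

for $i,j = 1,\ldots,N$. If the matrix in Eq. (2.26) is nonsingular,

$$\det\big[\langle\psi_i^L, L\psi_j^R(\alpha^R)\rangle\big] \neq 0, \tag{2.28}$$

then the first of the variational equations has as its unique solution

$$\alpha_i^L = 0. \tag{2.29}$$

$^3$ To guarantee $\Psi^L \in L^\infty$, we should really take

$$\Psi^L(\alpha,\alpha^L) = \exp\Big[\sum_{i=1}^{N_L}\alpha_i^L\,\psi_i^L(\alpha)\Big]. \tag{2.25}$$

However, this leads to equivalent results as Eq. (2.23).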

In that case, Eq. (2.27) is the only remaining equation and it determines the critical value $\alpha_*^R$.
Thus, the condition determining $P_s = \Omega^R$ in this approximation is the stationarity condition

$$\langle L^\dagger\psi_i^L\rangle_{\alpha_*^R} = 0, \tag{2.30}$$

for the finite set of moment-functions $\psi_i^L$, $i = 1,\ldots,N_L$, where $\langle\cdot\rangle_{\alpha^R}$ denotes the average
with respect to the trial state $\Psi^R(\alpha^R)$. In that case, the variational method
does not differ from the projection of the dynamics onto a finite set of moments. If one permits
a more general dependence of $\Psi^L$ on the parameters $\alpha^L$ than the linear ansatz Eq. (2.23), then
the variational method does not generally coincide with moment projection. However, we see
no advantage at this point to allowing a nonlinear dependence on $\alpha^L$.

It is possible to obtain the moment projection condition in a slightly more general form,
i.e. so that the moments depend upon the same set of parameters $\alpha$ as the trial state
$\Psi^R = \Psi^R(\alpha)$. Formally, we take $N_R = 0$, $N_L = N$. We may obtain for the $N$ parameters $\alpha$
determining equations of the form

$$\langle\psi_i^L(\alpha), L\Psi^R(\alpha)\rangle = 0, \tag{2.31}$$

or, equivalently,

$$\langle L^\dagger\psi_i^L(\alpha)\rangle_\alpha = 0. \tag{2.32}$$

This is accomplished by making the variational ansatz

$$\Psi^L = 1 + \sum_{i=1}^{N}\alpha_i^L\,\psi_i^L(\alpha) \quad \& \quad \Psi^R = \Psi^R(\alpha). \tag{2.33}$$

There may be some advantage in permitting the moments to vary along with the trial state.
Hence, this more general version is worked out in the Appendix.

A simple example of such ansatz as discussed above may be devised based upon a trial
weight $w = w(x)$, which is a normalized probability density, and an adapted set of orthogonal
polynomials $p_n(x)$:

$$\int dx\, w(x)\,p_n(x)\,p_{n'}(x) = \delta_{nn'}. \tag{2.34}$$

See [29], [30]. A natural form of the trial ansatz then takes $N_R = N_L\ (= N)$ and

$$\Psi^R(x;\alpha^R) = w(x)\cdot\sum_{n=0}^{N-1}\alpha_n^R\,p_n(x), \tag{2.35}$$

and

$$\Psi^L(x;\alpha^L) = \sum_{n=0}^{N-1}\alpha_n^L\,p_n(x). \tag{2.36}$$

This ansatz is a simple case of the type of Eq. (2.13), with no common parameters. Here the
stationarity condition becomes simply

$$L_N\,\alpha^R = 0 \quad \& \quad \alpha^L L_N = 0, \tag{2.37}$$

with

$$(L_N)_{nn'} = \langle p_n, L(w\,p_{n'})\rangle, \tag{2.38}$$

for $0 \leq n, n' < N$. In other words, the $\alpha^R$ and $\alpha^L$ should be, respectively, right and left
eigenvectors of the matrix $L_N$ with eigenvalue zero. It is easy to check that a left eigenvector
of $L_N$ for the eigenvalue zero always exists and is given simply by

$$\alpha_n^L = \delta_{n,0}. \tag{2.39}$$

It is possible to generalize the orthogonal polynomial ansatz by choosing the trial weight $w(\alpha)$,
depending upon some additional $M$ parameters $\alpha_i$, $i = 1,\ldots,M$. In that case, the adapted
orthogonal polynomials will depend also upon $\alpha$. After initial variation over $\alpha^R, \alpha^L$, a second
variation may be made to optimize the choice of $\alpha$.
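
To make the orthogonal-polynomial ansatz concrete, the sketch below builds the matrix of Eq. (2.38) for a one-dimensional Ornstein-Uhlenbeck process with a Gaussian trial weight. Everything specific here is an assumption made for illustration and is not taken from the text: the forward operator is taken as L P = d/dx(gamma x P) + noise d^2P/dx^2 (so its adjoint is -gamma x d/dx + noise d^2/dx^2), the trial variance `s2` is deliberately wrong, and all function names are hypothetical. The matrix element is evaluated by moving L onto the polynomial, so only Gaussian moments are needed.

```python
def gaussian_moment(k, s2):
    """E[x^k] under a zero-mean Gaussian weight of variance s2."""
    if k % 2 == 1:
        return 0.0
    value = 1.0
    for j in range(1, k, 2):      # (k-1)!! * s2^(k/2)
        value *= j * s2
    return value


def poly_mul(a, b):
    """Product of two polynomials stored as coefficient lists (index = power)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def expect(poly, s2):
    """Average of a polynomial with respect to the Gaussian trial weight w."""
    return sum(c * gaussian_moment(k, s2) for k, c in enumerate(poly))


def orthonormal_polys(n_max, s2):
    """Gram-Schmidt on 1, x, x^2, ... so that E_w[p_n p_m] = delta_nm."""
    polys = []
    for n in range(n_max):
        p = [0.0] * n + [1.0]
        for q in polys:
            c = expect(poly_mul(p, q), s2)
            q_padded = q + [0.0] * (len(p) - len(q))
            p = [pi - c * qi for pi, qi in zip(p, q_padded)]
        norm = expect(poly_mul(p, p), s2) ** 0.5
        polys.append([pi / norm for pi in p])
    return polys


def backward_generator(p, gamma, noise):
    """Adjoint operator on a polynomial: -gamma * x * p'(x) + noise * p''(x)."""
    out = [0.0] * len(p)
    for k, c in enumerate(p):
        out[k] -= gamma * k * c
        if k >= 2:
            out[k - 2] += noise * k * (k - 1) * c
    return out


def galerkin_matrix(polys, s2, gamma, noise):
    """(L_N)[n][m] = <p_n, L(w p_m)> = E_w[(L^dagger p_n) p_m]."""
    return [[expect(poly_mul(backward_generator(pn, gamma, noise), pm), s2)
             for pm in polys] for pn in polys]


def right_null_vector(LN):
    """Solve L_N a = 0 with a[0] = 1; L_N is lower triangular for this ansatz."""
    a = [1.0]
    for n in range(1, len(LN)):
        a.append(-sum(LN[n][m] * a[m] for m in range(n)) / LN[n][n])
    return a


gamma, noise = 1.0, 0.5      # exact stationary variance is noise / gamma = 0.5
s2, N = 1.0, 5               # trial weight deliberately too wide
polys = orthonormal_polys(N, s2)
LN = galerkin_matrix(polys, s2, gamma, noise)
a_right = right_null_vector(LN)
x_squared = [0.0, 0.0, 1.0]
variance = sum(a * expect(poly_mul(x_squared, p), s2) for a, p in zip(a_right, polys))
print("first row of L_N:", LN[0])
print("variance from the trial state:", variance)
```

The first row of the matrix vanishes because the adjoint operator annihilates the constant polynomial, which is the statement of Eq. (2.39). In this linear example the right null vector reproduces the exact variance even though the trial weight has the wrong width; no such exactness should be expected for nonlinear drifts.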

An advantage of the orthogonal polynomial scheme is that it may converge in the limit
$N \to \infty$: for an example, see [29]. Some sufficient conditions for convergence are discussed in
[30]. It is necessary for convergence that $\int dx\, [\Omega^R(x)]^2/w(x) < \infty$ [29]. Unfortunately, the expansion
ansatz Eq. (2.35) for the state need not be positive at all values of x. Instead, realizability can
be guaranteed by making an ansatz

$$\Psi^R = w(\alpha,\alpha^R), \tag{2.40}$$

in which

$$w(x;\alpha,\alpha^R) \geq 0, \qquad \int dx\, w(x;\alpha,\alpha^R) = 1. \tag{2.41}$$

This assures realizability whenever such an ansatz, along with Eq. (2.23), yields an "acceptable"
critical point. The criterion of realizability is especially important for a few-parameter ansatz,
incorporating certain physical insights and ideas, as a test of those beliefs. On the other hand,
for the case where $N \to \infty$, it may be preferable to impose the criterion of convergence. This
might be done even at the price of loss of realizability, if convergence for a statistic of particular
interest is rapid enough. The dual criteria of realizability and convergence ought to be regarded
as complementary in their applicability.

(ii) Time-Dependent Distributions

We first observe how the evolution equation Eq. (1.1) may be formulated variationally. Let
us define

$$\Gamma[\Psi^R,\Psi^L] = \int_0^\infty dt\, \langle\Psi^L(t), (\partial_t - L)\Psi^R(t)\rangle, \tag{2.42}$$

as a functional of "trajectories" $\Psi^H(t)$, $H = L, R$. We refer to this functional as the nonequilibrium
action. It is easy to see formally that the stationarity condition

$$\delta\Gamma[\Psi^R,\Psi^L] = 0, \tag{2.43}$$

is equivalent to

$$(\partial_t - L)|\Psi^R(t)\rangle = 0 \quad \& \quad (\partial_t + L^\dagger)|\Psi^L(t)\rangle = 0, \tag{2.44}$$

the variation being performed with the constraint

$$\langle\Psi^L(\infty),\Psi^R(\infty)\rangle = \langle\Psi^L(0),\Psi^R(0)\rangle. \tag{2.45}$$

In other words, a pair of trajectories is an extremal point of the action if and only if the "right
trajectory" is a solution of the evolution equation Eq. (1.1) and the "left trajectory" is a solution
of the adjoint equation, subject to the "endpoint constraint" Eq. (2.45). It is important to note
a particular exact solution of the adjoint equation

$$\Psi^L(x,t) \equiv 1. \tag{2.46}$$

In that case, the endpoint constraint becomes

$$\int dx\, \Psi^R(x,\infty) = \int dx\, \Psi^R(x,0), \tag{2.47}$$

which is automatically satisfied by any solution of the evolution equation. In other words,
$\Psi^L(t) \equiv 1$ together with any solution $\Psi^R(t)$ of the evolution equation provides an extremal
point of the action $\Gamma[\Psi^R,\Psi^L]$. In this important special case $\Gamma[\Psi^R,\Psi^L] = 0$. We may note the
equivalent form of the nonequilibrium action

$$\Gamma[\Psi^R,\Psi^L] = \int_0^\infty dt\, \Big(\langle\Psi^L(t),\dot\Psi^R(t)\rangle - \mathcal{H}[\Psi^R(t),\Psi^L(t)]\Big), \tag{2.48}$$

which shows that $\Psi^L$ is formally a momentum $\Pi^R$ canonically conjugate to $\Psi^R$. In that case,
the evolution equation and its adjoint are formally restated as "Hamilton's equations"

$$\dot\Psi^R(x) = \frac{\delta\mathcal{H}}{\delta\Psi^L(x)}[\Psi^R,\Psi^L] \quad \& \quad \dot\Psi^L(x) = -\frac{\delta\mathcal{H}}{\delta\Psi^R(x)}[\Psi^R,\Psi^L]. \tag{2.49}$$

This makes it obvious that the Hamiltonian is invariant along an extremal set of trajectories of
the action Eq. (2.48).



In the same manner as for the stationary case, we may use the previous variational principle
as the basis of an approximation method for the time-dependent PDF. The basic idea is similar
to time-dependent variational principles of standard use in quantum mechanics [2], [3], going
back to the early work of Dirac and Frenkel [4], [5]. The procedure is initiated by making trial
ansatz for the trajectories, in the form

$$\Psi^H(t) = \Psi^H(\alpha(t)) \tag{2.50}$$

with $H = L, R$. In other words, the reduction to a finite number of degrees of freedom is made with
the same functional form as for the stationary case and all of the time dependence is contained
in the parameters $\alpha(t)$. This is the same idea as in the general method of parametric PDF
closure, except that we here derive equations for the closure parameters variationally. Indeed,
we may substitute the trial trajectories into the action to obtain a reduced or parametric action

$$\Gamma[\alpha] = \int_0^\infty dt\, \big[\pi_i(\alpha(t))\,\dot\alpha_i(t) - \mathcal{H}(\alpha(t))\big], \tag{2.51}$$
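with

$$\pi_i(\alpha) = \Big\langle\Psi^L(\alpha),\frac{\partial\Psi^R}{\partial\alpha_i}(\alpha)\Big\rangle. \tag{2.52}$$

The Euler-Lagrange equations of the variational principle have the special form:

$$\{\alpha_i,\alpha_j\}\,\dot\alpha_j = \frac{\partial\mathcal{H}}{\partial\alpha_i}, \tag{2.53}$$

in which

$$\{\alpha_i,\alpha_j\} = \Big\langle\frac{\partial\Psi^L}{\partial\alpha_i},\frac{\partial\Psi^R}{\partial\alpha_j}\Big\rangle - \Big\langle\frac{\partial\Psi^L}{\partial\alpha_j},\frac{\partial\Psi^R}{\partial\alpha_i}\Big\rangle. \tag{2.54}$$

This is an infinite-dimensional generalization of the Lagrange bracket of classical mechanics; see
[31], p. 250. It is easily checked to have the properties

$$\{\alpha_j,\alpha_i\} = -\{\alpha_i,\alpha_j\}, \tag{2.55}$$

and

$$\frac{\partial}{\partial\alpha_i}\{\alpha_j,\alpha_k\} + \frac{\partial}{\partial\alpha_j}\{\alpha_k,\alpha_i\} + \frac{\partial}{\partial\alpha_k}\{\alpha_i,\alpha_j\} = 0. \tag{2.56}$$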



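Let us first verify the stated form of the Euler-Lagrange equations Eq. (2.53). The verification
follows from the result that

$$\frac{\delta}{\delta\alpha_i(t)}\int dt\,\pi_j(\alpha)\,\dot\alpha_j = \{\alpha_i,\alpha_j\}\,\dot\alpha_j. \tag{2.57}$$

By a simple calculation

$$\frac{\delta}{\delta\alpha_i(t)}\int dt\,\pi_j(\alpha)\,\dot\alpha_j = \Big\langle\frac{\partial\Psi^L}{\partial\alpha_i},\frac{\partial\Psi^R}{\partial\alpha_j}\Big\rangle\dot\alpha_j + \Big\langle\Psi^L,\frac{\partial^2\Psi^R}{\partial\alpha_i\,\partial\alpha_j}\Big\rangle\dot\alpha_j - \frac{d}{dt}\Big\langle\Psi^L,\frac{\partial\Psi^R}{\partial\alpha_i}\Big\rangle. \tag{2.58}$$

However,

$$\frac{d}{dt}\Big\langle\Psi^L,\frac{\partial\Psi^R}{\partial\alpha_i}\Big\rangle = \Big\langle\frac{\partial\Psi^L}{\partial\alpha_j},\frac{\partial\Psi^R}{\partial\alpha_i}\Big\rangle\dot\alpha_j + \Big\langle\Psi^L,\frac{\partial^2\Psi^R}{\partial\alpha_i\,\partial\alpha_j}\Big\rangle\dot\alpha_j. \tag{2.59}$$

This yields Eq. (2.57). The property Eq. (2.55) of Lagrange brackets is obvious. Eq. (2.56) follows
from the expression Eq. (2.54) by a simple calculation.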

If the matrix of Lagrange brackets $(\{\alpha_i,\alpha_j\})$ is non-degenerate, that is, $\det(\{\alpha_i,\alpha_j\}) \neq 0$,
then we may introduce a corresponding Poisson bracket $[\alpha_i,\alpha_j]$ as the elements of the inverse
matrix:

$$([\alpha_i,\alpha_j]) = (\{\alpha_i,\alpha_j\})^{-1}. \tag{2.60}$$

It is straightforward to show that the Poisson bracket has properties implied by those of the
Lagrange bracket, Eqs. (2.55), (2.56), namely:

$$[\alpha_j,\alpha_i] = -[\alpha_i,\alpha_j], \tag{2.61}$$

and

$$[\alpha_i,[\alpha_j,\alpha_k]] + [\alpha_j,[\alpha_k,\alpha_i]] + [\alpha_k,[\alpha_i,\alpha_j]] = 0. \tag{2.62}$$

The latter is the well-known Jacobi identity. The bracket may be extended to arbitrary functions
$f$ and $g$ of coordinates $\alpha$ via the definition

$$[f,g] = \sum_{p,q}\frac{\partial f}{\partial\alpha_p}\,[\alpha_p,\alpha_q]\,\frac{\partial g}{\partial\alpha_q}. \tag{2.63}$$

With this definition, the Poisson bracket satisfies Eqs. (2.61) and (2.62) for all functions. Note
that the Jacobi identity for general functions follows by the argument of p. 257. The
parametric equations may then be written as

$$\dot\alpha_i = [\alpha_i,\mathcal{H}], \tag{2.64}$$

which are in Hamiltonian form. In general, canonically conjugate variables do not exist for this
Hamiltonian (i.e. the system is noncanonical Hamiltonian). Notice that the Poisson brackets
$[\alpha_i,\alpha_j]$ of the system depend only upon the parametrization (i.e. the trial ansatz) and that the
dynamics enters solely through the Hamiltonian $\mathcal{H}(\alpha)$. We now see very simply that the fixed
points of the parametric evolution equations coincide with the critical points of the corresponding
Hamiltonian.$^4$ Furthermore, the parametric Hamiltonian is an integral of motion for the
evolution equations. Notice that, if the non-degeneracy condition failed at finite time, then the
solutions themselves to the parametric equations might become ill-defined.

A case of special interest is that in which $\Psi^H = \Psi^H(\alpha^H)$, $H = L, R$, with an equal number
of $\alpha^R$ and $\alpha^L$ parameters. Observe that the Lagrange brackets are now given simply as

$$\{\alpha_i^L,\alpha_j^R\} = \langle\psi_i^L(\alpha^L),\psi_j^R(\alpha^R)\rangle \tag{2.65}$$

and

$$\{\alpha_i^R,\alpha_j^L\} = -\langle\psi_j^L(\alpha^L),\psi_i^R(\alpha^R)\rangle, \tag{2.66}$$

with all other brackets vanishing. It is easy to check that the variables $\pi_i^R$ introduced as

$$\pi_i^R(\alpha^R,\alpha^L) = \Big\langle\Psi^L(\alpha^L),\frac{\partial\Psi^R}{\partial\alpha_i^R}(\alpha^R)\Big\rangle, \tag{2.67}$$

satisfy

$$[\alpha_i^R,\pi_j^R] = \delta_{ij}, \tag{2.68}$$

that is, $\pi^R$ is the momentum canonically conjugate to $\alpha^R$. If $\pi^R(\alpha^R,\alpha^L) = \pi^R$ is invertible
at each fixed $\alpha^R$ for $\alpha^L$ in terms of $\pi^R$ and $\alpha^R$, then by a change of variables the system has
canonical Hamiltonian form.

$^4$ Even without the non-degeneracy condition the fixed points would include all of the critical points of $\mathcal{H}$,
although there might be additional fixed points.

As in the static case, there is a criterion of "acceptability" of solutions, which requires that
$\Psi^L(t) \approx 1$ for all time $t$. Let us consider first for simplicity the previous special case with
$\Psi^H = \Psi^H(\alpha^H)$, $H = L, R$. Just as for the statics, we are motivated to adopt the linear ansatz

$$\Psi^L(x;\alpha^L) = 1 + \sum_{i=1}^{N}\alpha_i^L\,\psi_i^L(x). \tag{2.69}$$

In this case, the equations for $\alpha^L(t)$ become:

$$-\langle\psi_j^L,\psi_i^R(\alpha^R)\rangle\,\dot\alpha_j^L = \alpha_j^L\,\langle\psi_j^L, L\psi_i^R(\alpha^R)\rangle, \tag{2.70}$$

$i = 1,\ldots,N$, which have as an exact solution

$$\alpha^L(t) \equiv 0. \tag{2.71}$$

Within this same ansatz the equation remaining to be solved for $\alpha^R(t)$ reduces to:

$$\langle\psi_i^L,\psi_j^R(\alpha^R)\rangle\,\dot\alpha_j^R = \langle\psi_i^L, L\Psi^R(\alpha^R)\rangle. \tag{2.72}$$

For this, further simplification is possible by introducing moment-averages

$$m_i(\alpha^R) = \langle\psi_i^L\rangle_{\alpha^R}, \tag{2.73}$$
and the dynamical vector

$$V_i(\alpha^R) = \langle L^\dagger\psi_i^L\rangle_{\alpha^R}. \tag{2.74}$$

Because $\{\alpha_i^L,\alpha_j^R\} = \frac{\partial m_i}{\partial\alpha_j^R}(\alpha^R)$ for the ansatz Eq. (2.69), it follows that

$$\{\alpha_i^L,\alpha_j^R\}\,\dot\alpha_j^R = \frac{\partial m_i}{\partial\alpha_j^R}\,\dot\alpha_j^R = \dot m_i. \tag{2.75}$$

Therefore, the equation of motion Eq. (2.72) expressed in terms of the moments $m$ becomes
simply

$$\dot m_i = V_i(m), \tag{2.76}$$

where $V(m) = V(\alpha^R(m))$. In this way we see how "moment-closures" as they have been traditionally
employed in nonequilibrium dynamics are obtained in our scheme. Closure is achieved
by calculating all averages with respect to the PDF ansatz $P(x,t) = \Psi^R(x;\alpha^R(t))$ and then
eliminating the parameters $\alpha^R(t)$ in terms of the (equal number of) moments $m(t)$. As we shall
discuss in the next section, this variational method of moment-closure has definite theoretical
advantages.
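
The closure recipe of Eq. (2.76) can be sketched for a simple one-dimensional example. The model below is a hypothetical choice made only for illustration: drift K(x) = x - x^3, a forward operator L P = -d/dx(K P) + noise d^2P/dx^2 (this noise normalization is our assumption), a Gaussian trial state parametrized by mean and variance, and the moment functions x and x^2. Averages of the adjoint operator acting on the moment functions are evaluated with Gaussian moment formulas, and the parameters are eliminated in favor of the moments.

```python
def gaussian_closure_rhs(m1, m2, noise):
    """Right-hand side V(m) of the moment equations under a Gaussian trial state.

    Assumed model (hypothetical): drift K(x) = x - x**3 and forward operator
    L P = -d/dx (K P) + noise * d^2 P / dx^2, so L^dagger psi = K psi' + noise psi''.
    """
    mu = m1
    var = m2 - m1 * m1                               # parameters in terms of moments
    x3 = mu**3 + 3.0 * mu * var                      # Gaussian third moment
    x4 = mu**4 + 6.0 * mu**2 * var + 3.0 * var**2    # Gaussian fourth moment
    v1 = m1 - x3                                     # <L^dagger x>   = <K>
    v2 = 2.0 * (m2 - x4) + 2.0 * noise               # <L^dagger x^2> = <2 x K> + 2 noise
    return v1, v2


def integrate_closure(m1, m2, noise, dt=0.01, steps=5000):
    """Forward-Euler integration of dm/dt = V(m)."""
    for _ in range(steps):
        v1, v2 = gaussian_closure_rhs(m1, m2, noise)
        m1, m2 = m1 + dt * v1, m2 + dt * v2
    return m1, m2


noise = 0.25
m1_final, m2_final = integrate_closure(0.3, 0.5, noise)
# Symmetric fixed point of the closure: 3 var^2 - var - noise = 0.
var_fixed = (1.0 + (1.0 + 12.0 * noise) ** 0.5) / 6.0
print(m1_final, m2_final, var_fixed)
```

The fixed point reached by the integration is the point where V(m) vanishes, in line with the identification of fixed points with critical points of the parametric Hamiltonian. The resulting variance is the closure's approximation, not the exact stationary variance of the model.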

More generally, we may employ the ansatz Eq. (2.33), $\Psi^L = 1 + \sum_{i=1}^{N}\alpha_i^L\,\psi_i^L(\alpha)$ & $\Psi^R = \Psi^R(\alpha)$,
allowing for some parameter dependence of the moment-functions $\psi_i^L(\alpha)$. This choice
is considered in the Appendix, so we here just report the results. As with the case previously
considered, it is not hard to check that $\alpha^L(t) \equiv 0$ is an exact solution of its equation. The
remaining equation for $\alpha$ takes the form

$$\{\alpha_i^L,\alpha_j\}\,\dot\alpha_j = V_i(\alpha) \tag{2.77}$$

with

$$V_i(\alpha) = \langle L^\dagger\psi_i^L(\alpha)\rangle_\alpha, \tag{2.78}$$

generalizing Eq. (2.74), and

$$\{\alpha_i^L,\alpha_j\} = \Big\langle\psi_i^L(\alpha),\frac{\partial\Psi^R}{\partial\alpha_j}(\alpha)\Big\rangle. \tag{2.79}$$

By an easy calculation one can see also that

$$\{\alpha_i^L,\alpha_j\} = \frac{\partial}{\partial\alpha_j}\langle\psi_i^L(\alpha)\rangle_\alpha - \Big\langle\frac{\partial\psi_i^L}{\partial\alpha_j}(\alpha)\Big\rangle_\alpha. \tag{2.80}$$

Comparison with Eq. (2.11) in the work of Bayly [28] reveals that the Eq. (2.77) obtained via the
ansatz Eq. (2.33) is equivalent to the dynamical equations obtained by "moment-projection" in
the parametric PDF closure scheme. Here these equations are simply shown to have a variational
formulation.

As in the static case, a useful ansatz is provided by a fixed trial weight $w(x)$ and orthogonal
expansions

$$\Psi^R(\alpha^R) = w\cdot\sum_{n=0}^{N-1}\alpha_n^R\,p_n, \tag{2.81}$$

and

$$\Psi^L(\alpha^L) = \sum_{n=0}^{N-1}\alpha_n^L\,p_n. \tag{2.82}$$

In that case it is easy to calculate that

$$\{\alpha_n^L,\alpha_m^R\} = -\{\alpha_m^R,\alpha_n^L\} = \delta_{nm}, \tag{2.83}$$

and that

$$\mathcal{H}(\alpha^R,\alpha^L) = \sum_{n,m}\alpha_n^L\,(L_N)_{nm}\,\alpha_m^R. \tag{2.84}$$

Therefore we see that $\alpha^R$ and $\pi^R = \alpha^L$ are canonically conjugate, and the parametric action
is a quadratic form

$$\Gamma[\alpha^R,\alpha^L] = \int_0^\infty dt\, \big[\alpha^L\cdot\dot\alpha^R - \alpha^L\cdot L_N\,\alpha^R\big]. \tag{2.85}$$

In consequence, the evolution equations are linear

$$\dot\alpha^R = L_N\,\alpha^R \quad \& \quad \dot\alpha^L = -\alpha^L L_N, \tag{2.86}$$

for this particular ansatz. The second equation has exact solution $\alpha_n^L(t) = \delta_{n,0}$. The first
equation is a standard Galerkin truncation of the linear Liouville dynamics, Eq. (1.1).
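
The canonical structure of Eqs. (2.84)-(2.86) is easy to test numerically. The sketch below integrates the pair of linear equations with a classical fourth-order Runge-Kutta step and monitors the parametric Hamiltonian. The 3x3 matrix is a hypothetical stand-in for a truncated generator (its first row vanishes, as expected when the lowest polynomial is the constant 1), and all function names are illustrative.

```python
# Hypothetical truncated generator L_N with a vanishing first row, as expected
# for the orthogonal-polynomial ansatz when p_0 = 1.
LN = [[ 0.0,  0.0,  0.0],
      [ 0.0, -1.0,  0.0],
      [-0.7,  0.0, -2.0]]


def mat_vec(M, v):
    """Matrix acting on a column vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def vec_mat(u, M):
    """Row vector acting on a matrix from the left."""
    return [sum(u[i] * M[i][j] for i in range(len(u))) for j in range(len(M[0]))]


def hamiltonian(a_right, a_left):
    """Parametric Hamiltonian H = a_left . L_N a_right."""
    return sum(l * x for l, x in zip(a_left, mat_vec(LN, a_right)))


def velocity(state):
    """Hamilton's equations: da_right/dt = L_N a_right, da_left/dt = -a_left L_N."""
    n = len(LN)
    a_right, a_left = state[:n], state[n:]
    return mat_vec(LN, a_right) + [-x for x in vec_mat(a_left, LN)]


def rk4_step(state, dt):
    """One classical Runge-Kutta step for the combined (a_right, a_left) state."""
    k1 = velocity(state)
    k2 = velocity([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = velocity([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = velocity([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]


def evolve(a_right, a_left, dt=0.001, steps=2000):
    """Integrate both trajectories from t = 0 to t = dt * steps."""
    state = list(a_right) + list(a_left)
    for _ in range(steps):
        state = rk4_step(state, dt)
    n = len(LN)
    return state[:n], state[n:]


a_right0, a_left0 = [1.0, 0.5, 0.3], [1.0, 0.2, -0.1]
a_right1, a_left1 = evolve(a_right0, a_left0)
h0, h1 = hamiltonian(a_right0, a_left0), hamiltonian(a_right1, a_left1)

# The acceptable solution a_left = (1, 0, 0) should stay fixed and give H = 0.
_, a_left_fixed = evolve(a_right0, [1.0, 0.0, 0.0])
print("H before and after:", h0, h1)
print("left trajectory started at delta_{n,0}:", a_left_fixed)
```

A generic left trajectory grows in time while the right trajectory relaxes, yet their pairing through the Hamiltonian stays constant up to integration error. The left trajectory started at the exact solution does not move, and the zeroth right coefficient, which carries the normalization, is conserved.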

3 Constrained Variation and Effective Action

(i) The Principle of Least Effective Action 

For spatially extended systems, or for any system with large numbers of degrees of freedom,
it is certainly too ambitious to try to calculate the full PDF. Such a calculation would put any
trial ansatz to an extremely severe test and could hardly be expected to succeed, in general, with
only a few parameters. In any case, the physical interest is usually in some special low-order
statistic, such as a mean field or a correlation function. Such quantities are represented
by random variables z on the microscopic phase space, that is, by functions z = z(x) of the
dynamical variables x. In practice one will be mostly interested in some simple low-order
moments of the dynamical variables x themselves, e.g. $z = x$, $x \otimes x$, etc. It should be possible
to successfully calculate a statistic of this type with a simpler ansatz with just a few parameters,
if those are insightfully chosen. However, the variational method, as we have described it so 
far, allows one to calculate such a low-order statistic only as the by-product of calculating the 
full distribution. One would like to have a more direct variational method for any statistic of 
interest. 

In fact, it is well known in various contexts that such statistical quantities as expectations, 
correlations, etc. are characterized by a minimum principle for a certain functional. In (Eu- 
clidean) field theory this functional is called the "effective action," and was first rigorously 



investigated by Symanzik in [17]. In nonequilibrium statistical mechanics the variational prin- 
ciple associated to the effective action was pointed out some time ago by Graham |jl6(| . The fact 
that averages of suitable distributions are characterized by a minimum principle is also stan- 
dard in probability theory: see Section 3 of Such a principle has a very general basis and, 
indeed, its origin is the same as that of the familiar equilibrium variational principles of maximum
entropy, minimum free-energy, etc. Closely related ideas have been exploited recently to
develop moment-closure hierarchies for kinetic theories [33]. We shall give here a self-contained
discussion of the least-action principle, following the accounts in [17], [32].
The main requirement for its validity is finite exponential moments of the statistical distribution.
Let us denote by $\mathcal{P}$ the probability measure on histories of our stochastic dynamics.
Thus, $P_t$ is just the projection (or, marginal) at time $t$ of the distribution $\mathcal{P}$. Then, what is
required is that, integrating over the ensemble of histories $\{x(t) : -\infty < t < +\infty\}$,

$$\int D\mathcal{P}(x)\, e^{\langle f, z(x)\rangle} < \infty, \qquad (3.1)$$

where $f(t)$ is a real-vector valued test function and $\langle f, z\rangle = \int dt\, f(t)\cdot z(t)$. If Eq.(3.1) holds, we
may define

$$W[f] = \log \int D\mathcal{P}(x)\, e^{\langle f, z\rangle} \qquad (3.2)$$



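which is a cumulant-generating functional of the distribution $\mathcal{P}$. It is a consequence of the
positivity of the distribution and the Hölder inequality that

$$\int D\mathcal{P}(x)\, e^{\langle \lambda f_1 + (1-\lambda) f_2,\, z\rangle} \le \left(\int D\mathcal{P}(x)\, e^{\langle f_1, z\rangle}\right)^{\lambda} \left(\int D\mathcal{P}(x)\, e^{\langle f_2, z\rangle}\right)^{1-\lambda}, \qquad (3.3)$$

for $0 < \lambda < 1$, or

$$W[\lambda f_1 + (1-\lambda) f_2] \le \lambda W[f_1] + (1-\lambda) W[f_2]. \qquad (3.4)$$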

In other words, $W[f]$ is a globally convex functional of its argument. Observe that this is a result
just of a simple realizability inequality for the distribution $\mathcal{P}$. The corresponding conjugate
convex functional is

$$\Gamma[z] = \sup_{f}\left(\langle f, z\rangle - W[f]\right). \qquad (3.5)$$

This is the definition of the effective action for $z$-histories. Since $\Gamma[z]$ is also globally convex
under the assumption Eq.(3.1), it follows that it has an absolute minimum (possibly nonunique
if $\Gamma$ is not strictly convex). In fact,

$$\Gamma[z] \ge 0 \quad \& \quad \Gamma[\bar{z}] = 0, \qquad (3.6)$$

where

$$\bar{z}(t) = \int D\mathcal{P}(x)\, z(x(t)). \qquad (3.7)$$
The positivity of $\Gamma$ follows from the fact that $\langle f, z\rangle - W[f] = 0$ in Eq.(3.5) for $f = 0$. Furthermore,
by Jensen's inequality $\log \int D\mathcal{P}(x)\, e^{\langle f, z\rangle} \ge \langle f, \bar{z}\rangle$. Thus, $\langle f, \bar{z}\rangle - W[f] \le 0$ for all $f$, and so
$\Gamma[\bar{z}] = 0$. That the mean is characterized as the point at which $\Gamma$ achieves its minimum is just
the precise statement of the principle of least effective action.
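The content of Eqs.(3.2), (3.5) and (3.6) can be checked numerically in a much simpler setting than the space of histories. The following minimal Python sketch replaces the history measure by a single scalar random variable with three values; this toy distribution, the bisection bracket and all function names are assumptions made purely for illustration. It evaluates the cumulant-generating function, computes its Legendre transform by solving $W'(f) = z$, and confirms that the transform is nonnegative and vanishes at the mean.

```python
import math

def cumulant_generating_function(f, values, probs):
    """W(f) = log sum_i p_i exp(f * z_i), computed with a log-sum-exp shift."""
    exponents = [f * z for z in values]
    shift = max(exponents)
    return shift + math.log(sum(p * math.exp(e - shift)
                                for p, e in zip(probs, exponents)))

def tilted_mean(f, values, probs):
    """W'(f): the mean of z under the exponentially tilted distribution."""
    exponents = [f * z for z in values]
    shift = max(exponents)
    weights = [p * math.exp(e - shift) for p, e in zip(probs, exponents)]
    return sum(w * z for w, z in zip(weights, values)) / sum(weights)

def effective_action(z, values, probs, f_lo=-50.0, f_hi=50.0, iters=200):
    """Gamma(z) = sup_f (f*z - W(f)), located by bisection on W'(f) = z.

    Assumes z lies strictly between min(values) and max(values), so that the
    supremum is attained at a finite f inside the bracket (W' is increasing).
    """
    for _ in range(iters):
        f_mid = 0.5 * (f_lo + f_hi)
        if tilted_mean(f_mid, values, probs) < z:
            f_lo = f_mid
        else:
            f_hi = f_mid
    f_star = 0.5 * (f_lo + f_hi)
    return f_star * z - cumulant_generating_function(f_star, values, probs)

# Hypothetical three-point distribution used only for this illustration.
values = [-1.0, 0.0, 2.0]
probs = [0.5, 0.3, 0.2]
mean = sum(p * z for p, z in zip(probs, values))

gamma_at_mean = effective_action(mean, values, probs)
gamma_grid = [effective_action(z, values, probs)
              for z in (-0.8, -0.4, 0.3, 1.0, 1.7)]
```

In this sketch `gamma_at_mean` vanishes to rounding error, while every entry of `gamma_grid` is strictly positive, which is the scalar analogue of Eq.(3.6).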

All the derivations we have given for the distribution on histories, $\mathcal{P}$, could just as well
be given for the single-time stationary distribution, $P_s$. However, since the latter is hard to
specify, it is easier to work with a quantity derived from the effective action introduced above,
which is commonly referred to as the effective potential. This is obtained from the full action
by defining, for any time-independent $z$, the time-extended history $z_T(t)$ by

$$z_T(t) = \begin{cases} z & \text{if } 0 < t < T \\ \bar{z} & \text{otherwise.} \end{cases} \qquad (3.8)$$

Then the "effective potential" $V[z]$ is defined as the infinite-time limit

$$V[z] = \lim_{T\to\infty} \frac{1}{T}\,\Gamma[z_T]. \qquad (3.9)$$

The effective potential is appropriate to determine expected values in the time-invariant ground
state of the theory $\Omega^R = P_s$.

The effective potential has a direct significance in terms of the statistics of the empirical
time-average:

$$\bar{z}_T = \frac{1}{T}\int_0^T dt\, z(t). \qquad (3.10)$$

For an ergodic process, this random variable converges as $T \to \infty$ to the ensemble-average,
$\bar{z}_T \to \bar{z}$, almost surely in every realization. However, fluctuations away from the expected
behavior should furthermore occur with a small probability, decaying asymptotically for large
$T$ as

$$\mathrm{Prob}(\bar{z}_T \approx z) \sim \exp(-T\cdot V[z]). \qquad (3.11)$$

This is a refinement of the standard ergodic hypothesis. It will hold when the limit in Eq.(3.9)
exists, or, equivalently, if the similar limit, $\lim_{T\to+\infty} \frac{1}{T} W[h_T] = \lambda[h]$, exists. These are standard
results of "large deviations" in probability theory [34], [35]. In fact, what is in physics
referred to as the "effective potential" coincides for stochastic dynamics with the (level-1) rate
function in the Donsker-Varadhan large-deviations theory for ergodic Markov processes. The
probabilistic interpretation of the effective potential seems to have been first pointed out in
quantum field theory by Jona-Lasinio [36]. Such a large-deviations hypothesis as Eq.(3.11) was
conjectured some time ago by Takahashi for deterministic dynamical systems with sufficiently
chaotic solutions [37], and rigorous theorems have been proved under suitable hypotheses (e.g.
see [38], [39]). In this context the effective potential is simply related to the Kolmogorov-Sinai
entropy. The earliest origins of the above fluctuation hypothesis in statistical physics appear in
the 1931 "Onsager principle", as discussed by Oono in [40].

It follows from our assumptions that the effective potential is nonnegative, $V(z) \ge 0$, convex,
$\lambda_1 V(z_1) + \lambda_2 V(z_2) \ge V(\lambda_1 z_1 + \lambda_2 z_2)$, $\lambda_1 + \lambda_2 = 1$, and vanishes only at the ensemble-mean,
$V(\bar{z}) = 0$.$^{5}$ In the next section we develop a practical method for approximately calculating the
effective potential. Because of the connection of the effective potential with fluctuations of the
empirical mean, Eq.(3.11), it is very unlikely that a closure approximation which violates the
basic positivity and convexity properties of the effective potential can yield a reasonable result
for the ensemble-average itself.

5 The structure of the effective potential may be more complex if there is "ergodicity-breaking" associated
to multiple ergodic measures. In that case, there may be a convex set of points $z$ with nonempty interior on
which $V(z)$ vanishes. This would be the case if a so-called non-equilibrium phase-transition occurred. The
important applications of the effective potential in quantum field theory appeared precisely in this type of
situation, where basic symmetries of the quantum Hamiltonian are spontaneously broken by the occurrence
of multiple ground states. Similar phenomena may be expected in infinite-volume nonequilibrium systems,
especially in the parameter range after the first bifurcation from a unique laminar solution but before transition
to fully-developed turbulence has occurred.

(ii) Variational Characterization of Effective Potential

We now show how the effective potential $V(z)$ is related to the Hamiltonian $\mathcal{H}[\Psi^R, \Psi^L]$
discussed before by means of a constrained variation. A similar result was proved by Symanzik
in Euclidean field theory [17]. In our case, a modification is required associated to the
non-self-adjoint character of $\hat{L}$. More precisely, we have
Theorem 1 The effective potential

$$V[z] = \lim_{T\to\infty} \frac{1}{T}\,\Gamma[z_T], \qquad (3.12)$$

for a stationary Markov process is the value at the extremum point of the functional

$$V[\Psi^R, \Psi^L] = -\mathcal{H}[\Psi^R, \Psi^L], \qquad (3.13)$$

varying over all pairs of state vectors $\Psi^R, \Psi^L$ subject to the constraints

$$\langle \Psi^L, \Psi^R \rangle = 1 \qquad (3.14)$$

and

$$\langle \Psi^L, \hat{Z}\, \Psi^R \rangle = z. \qquad (3.15)$$

Here $\hat{Z}$ is the operator of multiplication by $z(x)$. Although the original version of the theorem
required just one trial state, there now must be two independent trial states.

Nevertheless, the proof is similar to the original one of Symanzik [17]. Let $\Omega^R = P_s$, $\Omega^L \equiv 1$.
Then the generating functional $W[h]$ introduced above may be represented in the operator
formulation by

$$W[h] = \log \left\langle \Omega^L,\, T\exp\left(\int_{-\infty}^{+\infty} dt\, \hat{L}_h(t)\right) \Omega^R \right\rangle, \qquad (3.16)$$

where $T$ denotes time-ordering (increasing right to left) and

$$\hat{L}_h(t) = \hat{L} + h(t)\cdot\hat{Z}. \qquad (3.17)$$

No time-dependence is required for the coordinate operators because the exponential factors
automatically introduce the correct Heisenberg picture operators after differentiating and setting
$h$ to zero. We note then that for a static field $h$ in the limit $T \to +\infty$,

$$\exp(W[h_T]) = \langle \Omega^L, \exp(T\cdot\hat{L}_h)\, \Omega^R \rangle \approx \langle \Omega^L, \Omega^R[h] \rangle \langle \Omega^L[h], \Omega^R \rangle \times \exp(T\cdot\lambda[h]), \qquad (3.18)$$
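where $\lambda[h]$ is the eigenvalue of the "perturbed operator"

$$\hat{L}_h = \hat{L} + h\cdot\hat{Z} \qquad (3.19)$$

with the largest real part and $\Omega^R[h]$, $\Omega^L[h]$ are the associated right and left "ground state"
eigenvectors:

$$\hat{L}_h |\Omega^R[h]\rangle = \lambda[h]\, |\Omega^R[h]\rangle, \qquad (3.20)$$

and

$$\hat{L}_h^\dagger |\Omega^L[h]\rangle = \lambda^*[h]\, |\Omega^L[h]\rangle. \qquad (3.21)$$

Furthermore, we can see that

$$\frac{\partial W[h_T]}{\partial h} = T\cdot z_\Omega[h] + o(T), \qquad (3.22)$$

with

$$z_\Omega[h] = \langle \Omega^L[h], \hat{Z}\, \Omega^R[h] \rangle. \qquad (3.23)$$

This can be obtained from the formula

$$\exp(W[h_T])\, \frac{\partial W[h_T]}{\partial h} = \left\langle \Omega^L, \frac{\partial}{\partial h} \exp(T\cdot\hat{L}_h)\, \Omega^R \right\rangle$$
$$= \langle \Omega^L, \Omega^R[h] \rangle \langle \Omega^L[h], \Omega^R \rangle \left\langle \Omega^L[h], \frac{\partial}{\partial h} \exp(T\cdot\hat{L}_h)\, \Omega^R[h] \right\rangle + O\left(e^{-T\Delta\lambda}\right), \qquad (3.24)$$

where $\Delta\lambda$ is the spectral gap between the real parts of the "ground state" eigenvalue and the
next highest eigenvalue. We have used the well-known fact that, for any one-parameter family
of operators $\hat{L}(h)$ depending smoothly on a parameter $h$,

$$\frac{\partial}{\partial h} \exp(\hat{L}(h)) = \exp(\hat{L}(h))\, \varphi(-\mathrm{Ad}\,\hat{L}(h)) \left[ \frac{\partial \hat{L}(h)}{\partial h} \right], \qquad (3.25)$$

where $\mathrm{Ad}\,\hat{L}$ denotes the "adjoint operator" defined by the commutator,

$$(\mathrm{Ad}\,\hat{L})[\hat{O}] = [\hat{L}, \hat{O}], \qquad (3.26)$$

and $\varphi(z)$ is the entire function $\varphi(z) = (e^z - 1)/z = 1 + \frac{1}{2} z + \frac{1}{6} z^2 + \cdots$. See pl|. Since

$$\langle \Omega^L[h], [\hat{L}_h, \hat{O}]\, \Omega^R[h] \rangle = 0, \qquad (3.27)$$

for any operator $\hat{O}$, only the first term survives in the expansion of $\varphi$ when substituted into the
first term of formula Eq.(3.24). This yields Eq.(3.22).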



Now let us consider the variational problem. If we incorporate the constraints by suitable
Lagrange multipliers, then the variational equation is just

$$\delta\left[ -\langle \Psi^L, \hat{L}\, \Psi^R \rangle - h\cdot\langle \Psi^L, \hat{Z}\, \Psi^R \rangle + \lambda \langle \Psi^L, \Psi^R \rangle \right] = 0, \qquad (3.28)$$

or

$$\langle \delta\Psi^L, (\hat{L}_h - \lambda)\, \Psi^R \rangle + \langle \Psi^L, (\hat{L}_h - \lambda)\, \delta\Psi^R \rangle = 0. \qquad (3.29)$$

In other words, there are infinitely many stationary points of the functional $V[\Psi^R, \Psi^L]$ subject
to the constraints. They consist precisely of pairs $(\Psi^R_\nu[h], \Psi^L_\nu[h])$ of eigenvectors of $\hat{L}_h$,

$$\hat{L}_h |\Psi^R_\nu[h]\rangle = \lambda_\nu[h]\, |\Psi^R_\nu[h]\rangle, \qquad (3.30)$$

and

$$\hat{L}_h^\dagger |\Psi^L_\nu[h]\rangle = \lambda^*_\nu[h]\, |\Psi^L_\nu[h]\rangle, \qquad (3.31)$$

corresponding to different branches of eigenvalues $\lambda_\nu[h]$, $\nu = 0, 1, 2, \ldots$ To be precise, we should
consider the stationary point corresponding to the branch with largest real part for each $h$, that
is, the pair of "ground state" eigenvectors $(\Omega^R[h], \Omega^L[h])$ introduced above. For small enough
$h$ this corresponds to the eigenvalue branch with $\lambda(0) = 0$, because the spectrum of $\hat{L}$ is all in
the left half of the complex $\lambda$-plane, $\mathrm{Re}\,\lambda < 0$, except for a simple eigenvalue at $\lambda = 0$. See [25]
and [26]. We refer to this as the "zero-branch" of eigenvalues.

Applying then the left eigenvector to the eigen-equation of the right vector and using the
constraints gives

$$\langle \Omega^L[h], \hat{L}\, \Omega^R[h] \rangle + h\cdot z[h] = \lambda[h], \qquad (3.32)$$
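and thus

$$-\langle \Omega^L[h], \hat{L}\, \Omega^R[h] \rangle = h\cdot z[h] - \lambda[h] = \frac{1}{T}\left( \langle h_T, z[h_T] \rangle - W[h_T] \right) + o(1). \qquad (3.33)$$

The first quantity is independent of $T$, so that we see taking the limit $T \to +\infty$ that

$$-\langle \Omega^L[h], \hat{L}\, \Omega^R[h] \rangle = V[z], \qquad (3.34)$$

as was claimed. □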

We have given only a formal proof of the theorem without a careful statement of the conditions,
which would certainly involve spectral properties of the "Liouville operator" $\hat{L}$, etc. The
assumption of a spectral gap may be stronger than required. The above variational characterization
of the effective potential is, in fact, equivalent to a spectral characterization of the
potential which has been rigorously established in the Donsker-Varadhan theory [35], [34], [32].
In that case it is shown, under suitable conditions, that $V[z] = \sup_h (z\cdot h - \lambda[h])$ where $\lambda[h]$ is
the "principal eigenvalue" of the operator $\hat{L}_h = \hat{L} + h\cdot\hat{Z}$. The equivalence of these two
characterizations follows from the preceding formal proof. The representation of the potential $V[z]$
as a Legendre transform of A[h] is entirely analogous to the representation of the entropy in 
equilibrium lattice spin systems as the Legendre transform of the free-energy, where the latter 
is determined as the leading eigenvalue of the transfer matrix. For deterministic dynamics the 
existence of a spectral gap in the so-called "Perron-Frobenius operator" has been established 
only for a few special cases, such as the work of Pollicott and Ruelle on Axiom A systems
p4j . The eigenvalue $\lambda[h]$ in that context is a particular case of the topological pressure $P(\varphi)$:
see [^] (or [ [i"3| ] for an introduction). For example, in the work of Ruelle on expanding
maps $f$ of compact spaces $X$, the effective potential would coincide with $P(\varphi)$ for the choice
$\varphi(x) = -\ln|f'(x)| + h\cdot z(x)$. Here $|f'(x)|$ is the Jacobian determinant of the map, and its
logarithm, $\ln|f'(x)|$, is the "Hamiltonian" in the thermodynamic formalism for expanding maps.
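The Legendre-transform representation $V[z] = \sup_h (z\cdot h - \lambda[h])$ can be made concrete on the smallest nontrivial Markov process. The Python sketch below is an illustration under stated assumptions, not part of the original argument: it takes a two-state continuous-time chain with jump rates `a` (state 0 to 1) and `b` (state 1 to 0) and the observable $z(x) = x$, so that $\hat{L}_h$ is a 2x2 matrix whose principal eigenvalue is available in closed form. The model, the search bracket and the function names are all hypothetical choices.

```python
import math

def principal_eigenvalue(h, a, b):
    """Largest eigenvalue of L_h = L + h*Z for the two-state chain.

    L = [[-a, b], [a, -b]] acts on probability vectors (p0, p1) and
    Z = diag(0, 1), so L_h has trace h - a - b and determinant -a*h.
    """
    tr = h - a - b
    return 0.5 * (tr + math.sqrt(tr * tr + 4.0 * a * h))

def eigenvalue_slope(h, a, b):
    """d(lambda)/dh, obtained by differentiating the closed form above."""
    tr = h - a - b
    return 0.5 * (1.0 + (tr + 2.0 * a) / math.sqrt(tr * tr + 4.0 * a * h))

def effective_potential(z, a, b, h_lo=-1.0e3, h_hi=1.0e3, iters=200):
    """V(z) = sup_h (z*h - lambda(h)) for 0 < z < 1.

    lambda(h) is convex, so its slope is increasing and the maximizing h
    is found by bisection on eigenvalue_slope(h) = z.
    """
    for _ in range(iters):
        h_mid = 0.5 * (h_lo + h_hi)
        if eigenvalue_slope(h_mid, a, b) < z:
            h_lo = h_mid
        else:
            h_hi = h_mid
    h_star = 0.5 * (h_lo + h_hi)
    return z * h_star - principal_eigenvalue(h_star, a, b)

a, b = 1.0, 2.0                   # hypothetical jump rates
stationary_mean = a / (a + b)     # stationary probability of state 1
potential_at_mean = effective_potential(stationary_mean, a, b)
potential_off_mean = effective_potential(0.6, a, b)
```

For this toy chain the potential vanishes at the stationary occupation `a / (a + b)` and is positive elsewhere, and the numerical values agree with the level-1 rate function $(\sqrt{a(1-z)} - \sqrt{bz})^2$ that follows from carrying out the Legendre transform of the closed-form eigenvalue by hand.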
(iii) Rayleigh-Ritz Approximation of the Effective Potential

We outline a simple variational method of Rayleigh-Ritz type to approximate the effective
potential and, thereby, the ensemble means. The ansatz used previously for $\Psi^R$, $\Psi^L$ may need
to be replaced by "augmented ansatz" $\tilde{\Psi}^R$, $\tilde{\Psi}^L$. The reason is that the left ground state, under
the imposed constraint, is no longer 1 identically and the constant component must be allowed
to vary. In other words, we must augment the linear ansatz Eq.(2.32) for the left ground state,
by setting

$$\tilde{\Psi}^L(\tilde{\alpha}, \tilde{\alpha}^L) = \sum_{i=0}^{N} \alpha_i^L\, \psi_i^L(\alpha). \qquad (3.35)$$

Here the test function

$$\psi_0^L(x; \alpha) \equiv 1 \qquad (3.36)$$

is included with an adjustable parameter $\alpha_0^L$. Of course, with the orthogonal expansion ansatz
Eqs.(2.35),(2.36), the constant term (zero-degree polynomial) is already included. However, if
it was not originally, it should now be added, and an additional free parameter $\alpha_0$ should be
added to the PDF ansatz $P = \Psi^R(\alpha)$ as well. The most natural way to do so is to simply
replace the normalized density $\Psi^R$ by

$$\tilde{\Psi}^R(x; \tilde{\alpha}) = \alpha_0\, \Psi^R(x; \alpha), \qquad (3.37)$$

where $\alpha_0$ denotes an arbitrary normalization factor:

$$\int dx\, \tilde{\Psi}^R(x; \tilde{\alpha}) = \alpha_0. \qquad (3.38)$$

Because $\tilde{\Psi}^L \not\equiv 1$ under the constraint, unit normalization of $\tilde{\Psi}^R$ is no longer required, but,
instead, the overlap condition $\langle \tilde{\Psi}^L, \tilde{\Psi}^R \rangle = 1$ must be maintained. Notice that we use the notations
$\tilde{\alpha}$, $\tilde{\alpha}^L$ simply to indicate the parameter vectors $\alpha$, $\alpha^L$ along with the additional zero-components
$\alpha_0$, $\alpha_0^L$. We shall refer to the new ansatz Eqs.(3.37),(3.35) as the natural augmentation. While
others can be contrived, this is the simplest extended ansatz and likely to be the most generally
useful.$^{6}$ Note it is not necessary to have a closed-form expression for $\tilde{\Psi}^R$, but it is enough only
to be able to calculate averages such as

$$m_i(\tilde{\alpha}) = \langle \psi_i^L(\alpha) \rangle_{\tilde{\alpha}} \qquad (3.39)$$

and

$$V_i(\tilde{\alpha}, h) = \langle \hat{L}_h^\dagger \psi_i^L(\alpha) \rangle_{\tilde{\alpha}}, \qquad (3.40)$$

with $i = 0, 1, \ldots, N$. In the most practical PDF closures, the ansatz $\tilde{\Psi}^R(x; \tilde{\alpha})$ will be given,
not explicitly, but instead by averages with respect to "surrogate" random variables $X_{\tilde{\alpha}}$ whose
distributions are parametrized by $\tilde{\alpha}$. From the joint ansatz for $\tilde{\Psi}^H$, $H = L, R$, an approximation
to the effective potential is then obtained:

$$V_*(z) = -\langle \tilde{\Psi}^L_*, \hat{L}\, \tilde{\Psi}^R_* \rangle, \qquad (3.41)$$

where $\tilde{\Psi}^L_* = \tilde{\Psi}^L(\tilde{\alpha}_*(h), \tilde{\alpha}^L_*(h))$ & $\tilde{\Psi}^R_* = \tilde{\Psi}^R(\tilde{\alpha}_*(h))$, and the parameters $\tilde{\alpha}_*(h)$, $\tilde{\alpha}^L_*(h)$, and
$h = h_*(z)$ are to be determined as follows.

Incorporating as before the constraints by suitable Lagrange multipliers $\lambda$ and $h$, the
extremum point within the ansatz is obtained by varying the function

$$F(\tilde{\alpha}, \tilde{\alpha}^L) = -\langle \tilde{\Psi}^L(\tilde{\alpha}, \tilde{\alpha}^L), \hat{L}_h \tilde{\Psi}^R(\tilde{\alpha}) \rangle + \lambda \langle \tilde{\Psi}^L(\tilde{\alpha}, \tilde{\alpha}^L), \tilde{\Psi}^R(\tilde{\alpha}) \rangle, \qquad (3.42)$$

of the parameters $\tilde{\alpha}$, $\tilde{\alpha}^L$. First, by variation of the $\tilde{\alpha}$-parameters, one obtains the equation

$$A(\tilde{\alpha}, h)\cdot\tilde{\alpha}^L = \lambda\, B(\tilde{\alpha})\cdot\tilde{\alpha}^L \qquad (3.43)$$

with the matrices $A(\tilde{\alpha}, h)$ and $B(\tilde{\alpha})$ defined by

$$A_{ij}(\tilde{\alpha}, h) = \frac{\partial V_j}{\partial \alpha_i}(\tilde{\alpha}, h) \qquad (3.44)$$

and

$$B_{ij}(\tilde{\alpha}) = \frac{\partial m_j}{\partial \alpha_i}(\tilde{\alpha}) \qquad (3.45)$$

for $i, j = 0, 1, \ldots, N$. Eq.(3.43) has the form of a generalized eigenvalue problem [24, [5|. The
parameter vector $\tilde{\alpha}^L(\tilde{\alpha}, h)$ is to be determined as the generalized eigenvector associated to the
"leading" eigenvalue.

6 Despite this, some of our arguments below do not apply to the natural augmentation! We will point out
where this occurs in the later discussion. This is really a technical issue, since all of the results discussed hereafter
still hold for the natural augmentation and it is only the proofs which need to be changed somewhat. Rather
than complicate the discussion, we have decided to present proofs under the simplest assumptions. These are
satisfied, for example, by the orthogonal expansion ansatz. The natural augmentation is discussed in detail
elsewhere Ml.

However, the proper definition of this last quantity requires some discussion. In the original
infinite-dimensional setting, the "leading" eigenvalue was defined to be that with largest real
part and, for $h$ small enough, it coincides with the "zero-branch" passing through 0 for $h = 0$.
On the other hand, within an approximation such as we consider here, these two quantities need
no longer coincide, although both exist. An eigenvalue branch $\lambda(\tilde{\alpha}, h)$ such that $\lambda(\tilde{\alpha}, 0) = 0$
exists always with the associated eigenvector $\alpha_i^L = \delta_{i0}$ at $h = 0$. Likewise, an eigenvalue with
a real part (denoted $\Lambda(\tilde{\alpha}, h)$) of largest value will certainly exist. Because the two quantities
$\lambda(\tilde{\alpha}, h)$ and $\Lambda(\tilde{\alpha}, h)$ are possibly distinct, either may be plausibly used as the basis of an
approximate calculation. However, there are compelling reasons to prefer the use of $\lambda(\tilde{\alpha}, h)$.
Most importantly, it is only due to $\lambda(\tilde{\alpha}, 0) = 0$ that $\tilde{\alpha}_*(0) = \tilde{\alpha}_*$ coincides with one of the
fixed points of the $h = 0$ vector field $V(\tilde{\alpha})$ (see below). Also, as a practical matter, it will
generally be easier to compute $\lambda(\tilde{\alpha}, h)$ than $\Lambda(\tilde{\alpha}, h)$, whose calculation requires a determination
of the entire spectrum of $A(\tilde{\alpha}, h)$. Actually, all of these considerations are rather academic. If
$\Lambda(\tilde{\alpha}, h) > \lambda(\tilde{\alpha}, h) = 0$ at $h = 0$, then the stability matrix $\frac{\partial V}{\partial \tilde{\alpha}}(\tilde{\alpha}) = [A(\tilde{\alpha}, 0)]^T$ has an
eigenvalue with positive real part. If this were to occur at the starting point $\tilde{\alpha}_*$, that point
would be linearly unstable under the dynamical flow of the vector field $V(\tilde{\alpha})$. That alone
would be enough to disqualify the point $\tilde{\alpha}_*$ from physical interest. On the other hand, if
$\Lambda(\tilde{\alpha}, h) = \lambda(\tilde{\alpha}, h)$ at $h = 0$, then, except for degenerate cases, this will also be true in a small
interval of $h$ about 0 and no distinction need be made. It will be explained below that the
approximate potential $V_*(z)$ calculated from $\lambda(\tilde{\alpha}, h)$ necessarily has the approximate mean

$$z_* = \langle z \rangle_{\tilde{\alpha}_*} \qquad (3.46)$$

as a critical point, with $V_*(z_*) = 0$, but that $V_*(z)$ need no longer be convex at $z_*$.

Returning, then, to the specification of the approximation scheme, we next determine $\tilde{\alpha}_*(h)$
as the value of $\tilde{\alpha}$ satisfying the variational equation under the parameters $\tilde{\alpha}^L$:

$$V_i(\tilde{\alpha}, h) = \lambda(\tilde{\alpha}, h)\, m_i(\tilde{\alpha}), \qquad (3.47)$$

$i = 0, 1, \ldots, N$. This may be thought of as a type of "nonlinear eigenvalue condition" and $\tilde{\alpha}_*(h)$
as the associated eigenvector. Since $\lambda(\tilde{\alpha}, 0) = 0$, it is a consequence of this definition that

$$\tilde{\alpha}_*(0) = \tilde{\alpha}_*, \qquad (3.48)$$

with $\tilde{\alpha}_*$ a fixed point of the dynamical vector $V(\tilde{\alpha})$ defined in Eq.(2.74). As long as the stability
matrix $\frac{\partial V}{\partial \tilde{\alpha}}(\tilde{\alpha}_*)$ is non-singular, the implicit function theorem guarantees that Eq.(3.47) has a
solution for at least some small interval of $h$ about 0.$^{7}$ For practical computation, a
Newton-Raphson or other root-finding algorithm may be employed (see [47], Ch.9), starting with $\tilde{\alpha}_*$ at
$h = 0$ and tracking a sequence of roots $\tilde{\alpha}_*(h_k)$ iteratively for $h_k$ of increasing magnitude. If the
starting ansatz $\Psi^R$, $\Psi^L$ has more than one acceptable fixed point, then any of them may be used
as a basis for the calculation. Next, $\tilde{\alpha}^L_*(h)$ is defined as $\tilde{\alpha}^L(\tilde{\alpha}_*(h), h)$ with its normalization
fixed by the constraint $\langle \tilde{\Psi}^L_*, \tilde{\Psi}^R_* \rangle = 1$. This allows one to define the function

$$z_*(h) = \langle \tilde{\Psi}^L(\tilde{\alpha}_*(h), \tilde{\alpha}^L_*(h)),\, \hat{Z}\, \tilde{\Psi}^R(\tilde{\alpha}_*(h)) \rangle, \qquad (3.50)$$
and to determine $h$ thereby as the value $h_*(z)$ of its inverse function at $z$. It should be remarked
that both $\tilde{\alpha}_*(h)$ and $\tilde{\alpha}^L_*(h)$ are real vectors, at least for small enough $h$, and therefore $z_*[h]$
is a real vector too. The eigenvalue $\lambda(\tilde{\alpha}, h)$ will be real for $h$ sufficiently near 0 and, in that
case, the associated generalized eigenvector $\tilde{\alpha}^L(\tilde{\alpha}, h)$ for the real matrices $A(\tilde{\alpha}, h)$, $B(\tilde{\alpha})$ will
also be real. We observe for $h = 0$ that $z_*[0] = z_*$.

7 This is the property which is not satisfied by the "natural augmentation." In fact, it is not hard to show
that with that choice

$$\frac{\partial V}{\partial \tilde{\alpha}}(\tilde{\alpha}_*) = \begin{pmatrix} 0 & 0 \\ 0 & \frac{\partial V}{\partial \alpha}(\alpha_*) \end{pmatrix}. \qquad (3.49)$$

Clearly, this matrix is singular. However, as we have already noted, it is only the present proofs which fail and
the results themselves, proved here assuming non-singularity, still hold for the "natural augmentation" Eq|.

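These prescriptions complete our recipe for the Rayleigh-Ritz approximation to the effective
potential $V(z)$. We now establish an important representation for $V_*(z)$. Let us define

$$\lambda_*(h) = \lambda(\tilde{\alpha}_*(h), h), \qquad (3.51)$$

in terms of the quantities introduced above. We now prove

Proposition 1 The approximate effective potential $V_*(z)$ is a formal Legendre transform of
$\lambda_*(h)$; that is,

$$\frac{\partial \lambda_*}{\partial h}(h) = z_*(h) \qquad (3.52)$$

and

$$V_*(z) = z_*(h)\cdot h - \lambda_*(h), \qquad (3.53)$$

for $h = h_*(z)$.

Proof. Setting

$$\tilde{\Psi}^L_*(x; h) = \sum_{i=0}^{N} \alpha^L_{*i}(h)\, \psi_i^L(x; \alpha_*(h)), \qquad (3.54)$$

and

$$\tilde{\Psi}^R_*(x; h) = \tilde{\Psi}^R(x; \tilde{\alpha}_*(h)), \qquad (3.55)$$

we observe the overlap condition $\langle \tilde{\Psi}^L_*(h), \tilde{\Psi}^R_*(h) \rangle = 1$ becomes simply

$$\sum_{i=0}^{N} \alpha^L_{*i}(h)\, m_i(\tilde{\alpha}_*(h)) = 1. \qquad (3.56)$$

We next show that

$$\langle \tilde{\Psi}^L_*(h), \hat{L}_h \tilde{\Psi}^R_*(h) \rangle = \lambda_*(h). \qquad (3.57)$$

In fact,

$$\langle \tilde{\Psi}^L_*(h), \hat{L}_h \tilde{\Psi}^R_*(h) \rangle = \sum_{i=0}^{N} \alpha^L_{*i}(h)\, V_i(\tilde{\alpha}_*(h), h)$$
$$= \lambda_*(h) \sum_{i=0}^{N} \alpha^L_{*i}(h)\, m_i(\tilde{\alpha}_*(h))$$
$$= \lambda_*(h), \qquad (3.58)$$

where the first line follows using the linear ansatz, Eq.(3.54) above, the second line follows
from the "nonlinear eigenvalue condition" Eq.(3.47), and the last line follows from the overlap
condition Eq.(3.56). Now it is easy to see that

$$V_*(z) = -\langle \tilde{\Psi}^L_*(h), \hat{L}\, \tilde{\Psi}^R_*(h) \rangle$$
$$= \langle \tilde{\Psi}^L_*(h), \hat{Z}\, \tilde{\Psi}^R_*(h) \rangle\cdot h - \langle \tilde{\Psi}^L_*(h), \hat{L}_h \tilde{\Psi}^R_*(h) \rangle$$
$$= z_*(h)\cdot h - \lambda_*(h), \qquad (3.59)$$

which is Eq.(3.53).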

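The verification of Eq.(3.52) is a straightforward but somewhat tedious calculation. Using
once more the basic expression Eq.(3.57) for $\lambda_*(h)$, one finds by differentiation that

$$\frac{\partial \lambda_*}{\partial h}(h) = z_*(h) + \left\langle \frac{\partial \tilde{\Psi}^L_*}{\partial h}(h), \hat{L}_h \tilde{\Psi}^R_*(h) \right\rangle + \left\langle \tilde{\Psi}^L_*(h), \hat{L}_h \frac{\partial \tilde{\Psi}^R_*}{\partial h}(h) \right\rangle. \qquad (3.60)$$

Furthermore, calculation yields for the second term

$$\left\langle \frac{\partial \tilde{\Psi}^L_*}{\partial h}(h), \hat{L}_h \tilde{\Psi}^R_*(h) \right\rangle = \sum_{i=0}^{N} \left( \frac{\partial}{\partial h} \alpha^L_{*i}(h) \right) \lambda_*(h)\, m_i(\tilde{\alpha}_*(h))$$
$$+ \sum_{i,j=0}^{N} \alpha^L_{*i}(h) \left\langle \hat{L}_h^\dagger \frac{\partial \psi_i^L}{\partial \alpha_j}(\alpha_*(h)) \right\rangle_{\tilde{\alpha}_*(h)} \frac{\partial \alpha_{*j}}{\partial h}(h), \qquad (3.61)$$

where the "nonlinear eigenvalue condition" Eq.(3.47) was used in the first sum on the righthand
side. Likewise, for the third term in Eq.(3.60)

$$\left\langle \tilde{\Psi}^L_*(h), \hat{L}_h \frac{\partial \tilde{\Psi}^R_*}{\partial h}(h) \right\rangle = \sum_{i=0}^{N} \alpha^L_{*i}(h)\cdot\lambda_*(h) \left( \frac{\partial}{\partial h} m_i(\tilde{\alpha}_*(h)) \right)$$
$$- \sum_{i,j=0}^{N} \alpha^L_{*i}(h) \left\langle \hat{L}_h^\dagger \frac{\partial \psi_i^L}{\partial \alpha_j}(\alpha_*(h)) \right\rangle_{\tilde{\alpha}_*(h)} \frac{\partial \alpha_{*j}}{\partial h}(h), \qquad (3.62)$$

where the generalized eigenvalue equation Eq.(3.43) was used in the first sum on the righthand
side. Adding the two contributions, the last terms of each cancel and the result is

$$\left\langle \frac{\partial \tilde{\Psi}^L_*}{\partial h}(h), \hat{L}_h \tilde{\Psi}^R_*(h) \right\rangle + \left\langle \tilde{\Psi}^L_*(h), \hat{L}_h \frac{\partial \tilde{\Psi}^R_*}{\partial h}(h) \right\rangle$$
$$= \lambda_*(h) \sum_{i=0}^{N} \left[ \left( \frac{\partial}{\partial h} \alpha^L_{*i}(h) \right) m_i(\tilde{\alpha}_*(h)) + \alpha^L_{*i}(h)\, \frac{\partial}{\partial h} m_i(\tilde{\alpha}_*(h)) \right]$$
$$= \lambda_*(h)\, \frac{\partial}{\partial h} \left[ \sum_{i=0}^{N} \alpha^L_{*i}(h)\, m_i(\tilde{\alpha}_*(h)) \right]$$
$$= 0. \qquad (3.63)$$

The constant overlap, Eq.(3.56), was invoked in the last line. Thus, $\frac{\partial \lambda_*}{\partial h}(h) = z_*(h)$. It may
be worth remarking that this result is a nonlinear generalization of the Hellmann-Feynman
theorem used in quantum-mechanical perturbation theory. □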
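The Hellmann-Feynman character of Eq.(3.52) is easiest to see in the exact linear setting of Eqs.(3.20)-(3.23), where the slope of the leading eigenvalue equals the matrix element of $\hat{Z}$ between the normalized left and right eigenvectors. The Python sketch below checks this by finite differences for a 2x2 example. The two-state chain with rates `a`, `b`, the observable $z(x) = x$ and the helper names are assumptions introduced only for this illustration.

```python
import math

def leading_eigenvalue(h, a, b):
    """Largest eigenvalue of L_h = [[-a, b], [a, h - b]] (trace h-a-b, det -a*h)."""
    tr = h - a - b
    return 0.5 * (tr + math.sqrt(tr * tr + 4.0 * a * h))

def leading_eigenvectors(h, a, b):
    """Left and right eigenvectors of L_h for the leading eigenvalue.

    right solves L_h right = lam * right, left solves L_h^T left = lam * left;
    both follow from the first row of the respective 2x2 eigen-equation.
    """
    lam = leading_eigenvalue(h, a, b)
    right = (b, a + lam)
    left = (a, a + lam)
    return left, right

def tilted_mean_from_eigenvectors(h, a, b):
    """<left, Z right> / <left, right> with Z = diag(0, 1)."""
    left, right = leading_eigenvectors(h, a, b)
    overlap = left[0] * right[0] + left[1] * right[1]
    return left[1] * right[1] / overlap

a, b, h = 1.0, 2.0, 0.7    # hypothetical rates and field strength
eps = 1.0e-6
finite_difference_slope = (leading_eigenvalue(h + eps, a, b)
                           - leading_eigenvalue(h - eps, a, b)) / (2.0 * eps)
matrix_element = tilted_mean_from_eigenvectors(h, a, b)
```

Here `finite_difference_slope` and `matrix_element` agree to the accuracy of the difference quotient, and at `h = 0` the matrix element reduces to the stationary mean `a / (a + b)`.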



It is a consequence of this proposition that

$$V_*(z_*) = 0 \quad \& \quad \frac{\partial V_*}{\partial z}(z_*) = 0. \qquad (3.64)$$

Indeed, since $z_*(0) = z_*$ and $\lambda_*(0) = 0$, the first follows directly from Eq.(3.53). For the second,
we use the simple result of Eq.(3.53) that

$$\frac{\partial V_*}{\partial z}(z) = h_*(z) \qquad (3.65)$$

and $h_*(z_*) = 0$. Hence we conclude that the properties Eq.(3.64), which hold for the exact
effective potential, are automatically guaranteed to hold in the Rayleigh-Ritz approximation.
However, the important property of convexity of $V_*(z)$ is not guaranteed. All that can be
inferred from Eq.(3.53) is that $V_*(z)$ is convex in $z$ if and only if $\lambda_*(h)$ is convex in $h$.

Let us first, however, note a useful simplification. Just as was discussed in Section 2.ii, it
is very convenient here also to replace the parameters $\tilde{\alpha}$ by the moments $m$. Assuming that
the matrix $B(\tilde{\alpha}) = \frac{\partial m}{\partial \tilde{\alpha}}$ defined in Eq.(3.45) is nonsingular, then the relation $m = m(\tilde{\alpha})$ may
be inverted, at least locally, to give $\tilde{\alpha}(m)$ as a function of $m$. Therefore, the $m$ may be used
as parameters instead of the $\tilde{\alpha}$, writing as well $\psi^L(m) = \psi^L(\alpha(m))$, $\tilde{\Psi}^R(m) = \tilde{\Psi}^R(\tilde{\alpha}(m))$
without any possibility of confusion. In this case, the equation obtained under variation of the
$m$-parameters reduces to an ordinary eigenvalue problem:

$$A(m, h)\cdot\tilde{\alpha}^L = \lambda\cdot\tilde{\alpha}^L \qquad (3.66)$$

with the matrix $A(m, h)$ defined similarly as before:

$$A_{ij}(m, h) = \frac{\partial}{\partial m_i} V_j(m, h) \qquad (3.67)$$

and

$$V_i(m, h) = \langle \hat{L}_h^\dagger \psi_i^L(m) \rangle_m. \qquad (3.68)$$

Once again, $\lambda(m, h)$ may be taken as the "leading" eigenvalue and $\tilde{\alpha}^L(m, h)$ its associated
eigenvector. Likewise, an equation may be obtained for $m_*(h)$ by varying $\tilde{\alpha}^L$, which is now
simply

$$V(m, h) = \lambda(m, h)\, m. \qquad (3.69)$$

With these additional simplifications, the procedure to calculate $V_*(z)$ is otherwise the same as
before.

In calculating the approximation $V_*(z)$ by the Rayleigh-Ritz method, one obtains as well
approximations to $\Omega^H$, $H = L, R$. Since it requires more work to impose the constraints, it may
seem that nothing has been gained and, even, something has been lost. However, a moderately
good ansatz $\Psi^H(\alpha, \alpha^H)$ may yield rather poor results for $\Omega^R$ and yet quite good results for $\bar{z}$. It
is useful to calculate the effective potential from the ansatz as a diagnostic since the qualitative
features should be reproduced that $V_*(z) \ge 0$ and that $z_*$ is a minimum point of $V_*$ with
$V_*(z_*) = 0$. If one's only interest is in the mean values, then these are more realistic criteria
of "acceptability" of the approximation than to insist, e.g., that $\Psi^R \ge 0$ everywhere. Negative
density in an insignificant region of $x$-space might have very little effect on the approximate
average $z_*$, which could be quite close to the true average $\bar{z}$. On the other hand, a failure
of convexity of $V_*(z)$ would doubtless indicate serious errors in $z_*$ as an approximation to $\bar{z}$.
Such a "prediction" would need to be discarded as spurious. The condition of convexity of the
effective potential is not contained in any property of the closure dynamics and it incorporates
important additional information from the exact Liouville dynamics.
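The root-tracking step of the recipe, in the moment parametrization where the nonlinear eigenvalue condition reads $V(m, h) = \lambda\, m$, can be sketched in a few lines of Python. Everything specific below is an assumption made for illustration: the model is a two-state chain with rates `a` (0 to 1) and `b` (1 to 0), the test functions are $\psi_0 = 1$ and $\psi_1 = x$, the observable is $z(x) = x$, and the temporary normalization $m_0 = 1$ replaces the overlap condition (permissible here because this toy closure is linear in $m$, so the scale of $m$ can be restored afterwards). For these choices $V_0 = h\, m_1$ and $V_1 = a(m_0 - m_1) - b\, m_1 + h\, m_1$. Newton-Raphson is started at the $h = 0$ fixed point and the root is followed along a sequence of increasing $h_k$.

```python
import math

def closure_vector_field(m1, h, a, b):
    """V(m, h) for moments m = (m0, m1) = (<1>, <x>) with m0 held at 1."""
    v0 = h * m1
    v1 = a * (1.0 - m1) - b * m1 + h * m1
    return v0, v1

def newton_step(m1, lam, h, a, b):
    """One Newton-Raphson update for F(m1, lam) = V(m, h) - lam * m = 0."""
    v0, v1 = closure_vector_field(m1, h, a, b)
    f0 = v0 - lam            # i = 0 component, using m0 = 1
    f1 = v1 - lam * m1       # i = 1 component
    # Jacobian of (f0, f1) with respect to (m1, lam).
    j00, j01 = h, -1.0
    j10, j11 = -(a + b - h) - lam, -m1
    det = j00 * j11 - j01 * j10
    d_m1 = (-f0 * j11 + f1 * j01) / det      # Cramer's rule for J d = -F
    d_lam = (-j00 * f1 + j10 * f0) / det
    return m1 + d_m1, lam + d_lam

def track_zero_branch(h_values, a, b, tol=1.0e-12, max_iter=50):
    """Follow the root that starts at the h = 0 fixed point (lam = 0)."""
    m1, lam = a / (a + b), 0.0
    branch = []
    for h in h_values:
        for _ in range(max_iter):
            new_m1, new_lam = newton_step(m1, lam, h, a, b)
            converged = abs(new_m1 - m1) + abs(new_lam - lam) < tol
            m1, lam = new_m1, new_lam
            if converged:
                break
        branch.append((h, m1, lam))
    return branch

a, b = 1.0, 2.0    # hypothetical jump rates
branch = track_zero_branch([0.1 * k for k in range(6)], a, b)
```

Because the closure is exact for this two-state example, the tracked eigenvalue can be compared against the leading root of $\lambda^2 - (h - a - b)\lambda - ah = 0$; in a genuine application no such closed form is available and the same continuation would be run on the ansatz-dependent $V(m, h)$.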
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(iv) Variational Characterization of Effective Action 

We now show that the time-dependent effective action can also be obtained by a constrained
variation of the nonequilibrium action functional $\Gamma[\Psi^R, \Psi^L]$. The proof of this theorem is
almost the same as the proof of a corresponding result in quantum field theory due to Jackiw
and Kerman [19]. Just as the Symanzik theorem is a constrained version of the familiar quantum
variational principle for energy eigenvalues and eigenvectors, the Jackiw-Kerman theorem
can be seen as a constrained version of Dirac's Q variational formulation of the Schrodinger 
equation (a quantum analogue of Hamilton's principle). In addition to providing a basis for 
time-dependent Rayleigh-Ritz calculations, the Jackiw-Kerman-type theorem establishes the
existence of a Lagrangian functional for the effective action. 

Theorem 2 The effective action $\Gamma[z]$ for the initial-value problem is the value at the extremum
point of the functional

$$\Gamma[\Psi^R, \Psi^L] = \int_0^\infty dt\, \langle \Psi^L(t), (\partial_t - \hat{L})\, \Psi^R(t) \rangle, \qquad (3.70)$$

when that is independently varied over all pairs of time-dependent state vectors subject to the
constraints for each time $t$:

$$\langle \Psi^L(t), \Psi^R(t) \rangle = 1 \qquad (3.71)$$

and

$$\langle \Psi^L(t), \hat{Z}\, \Psi^R(t) \rangle = z(t), \qquad (3.72)$$

and also to the boundary conditions

$$|\Psi^R(0)\rangle = P_0 \quad \& \quad |\Psi^L(\infty)\rangle \equiv 1. \qquad (3.73)$$

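The proof is as follows:

As in the static case, we use the representation

$$W[h] = \log \left\langle \Omega^L,\, T\exp\left(\int_0^\infty dt\, \hat{L}_h(t)\right) \Omega^R \right\rangle, \qquad (3.74)$$

where $\hat{L}_h(t) = \hat{L} + h(t)\cdot\hat{Z}$ as before but now $\Omega^R = P_0$, $\Omega^L \equiv 1$. In other words,

$$W[h] = \log \langle \Omega^L(t), \Omega^R(t) \rangle, \qquad (3.75)$$

where

$$|\Omega^R(t)\rangle = T\exp\left(\int_0^t ds\, \hat{L}_h(s)\right) |\Omega^R\rangle, \qquad (3.76)$$

and, if $\bar{T}$ denotes "anti-time-ordering,"

$$|\Omega^L(t)\rangle = \bar{T}\exp\left(\int_t^\infty ds\, \hat{L}_h^\dagger(s)\right) |\Omega^L\rangle. \qquad (3.77)$$

These trajectories are the solutions, respectively, of the initial-value problem

$$\partial_t |\Omega^R(t)\rangle = \hat{L}_h(t)\, |\Omega^R(t)\rangle, \qquad \Omega^R(0) = P_0, \qquad (3.78)$$

and of the final-value problem

$$\partial_t |\Omega^L(t)\rangle = -\hat{L}_h^\dagger(t)\, |\Omega^L(t)\rangle, \qquad \Omega^L(\infty) \equiv 1. \qquad (3.79)$$

On the other hand, the variational problem can be solved by the use of Lagrange multipliers
for the time-dependent constraints:

$$\delta\left( \Gamma[\Psi^R, \Psi^L] - \int_0^\infty dt\, \left[ h(t)\cdot\langle \Psi^L(t), \hat{Z}\, \Psi^R(t) \rangle - \lambda(t) \langle \Psi^L(t), \Psi^R(t) \rangle \right] \right) = 0, \qquad (3.80)$$

yielding

$$(\partial_t - \hat{L}_h(t))\, |\Psi^R(t)\rangle = -\lambda(t)\, |\Psi^R(t)\rangle \qquad (3.81)$$

and

$$(\partial_t + \hat{L}_h^\dagger(t))\, |\Psi^L(t)\rangle = \lambda^*(t)\, |\Psi^L(t)\rangle. \qquad (3.82)$$

In that case we see that

$$|\Omega^R(t)\rangle = \exp\left[ \int_0^t ds\, \lambda(s) \right]\cdot|\Psi^R(t)\rangle \qquad (3.83)$$

and

$$|\Omega^L(t)\rangle = \exp\left[ \int_t^\infty ds\, \lambda^*(s) \right]\cdot|\Psi^L(t)\rangle. \qquad (3.84)$$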
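Substituting these into Eq.(3.75) and using the overlap constraint, we obtain the expression
for the cumulant-generating function that

$$W[h] = \int_0^\infty dt\, \lambda(t) = \int_0^\infty dt\, \langle \Psi^L(t), (-\partial_t + \hat{L} + h(t)\cdot\hat{Z})\, \Psi^R(t) \rangle. \qquad (3.85)$$

The last equation was obtained by applying $\Psi^L(t)$ on the left to Eq.(3.81). Note that, indeed,
$\delta W[h]/\delta h(t) = z(t)$ by a simple calculation:

$$\frac{\delta W[h]}{\delta h(t)} = z(t) + \int_0^\infty ds\, \lambda(s) \left[ \left\langle \frac{\delta \Psi^L(s)}{\delta h(t)}, \Psi^R(s) \right\rangle + \left\langle \Psi^L(s), \frac{\delta \Psi^R(s)}{\delta h(t)} \right\rangle \right]$$
$$= z(t) + \int_0^\infty ds\, \lambda(s)\, \frac{\delta}{\delta h(t)} \langle \Psi^L(s), \Psi^R(s) \rangle$$
$$= z(t). \qquad (3.86)$$

To obtain the first line we used Eqs.(3.81),(3.82) and to obtain the last line we used again the
overlap condition. We therefore get directly from Eq.(3.85) that

$$\Gamma[z] = \int_0^\infty dt\, h(t)\cdot z(t) - W[h] = \int_0^\infty dt\, \langle \Psi^L(t), (\partial_t - \hat{L})\, \Psi^R(t) \rangle, \qquad (3.87)$$

as was claimed. □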



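As remarked above, the quantity

$$\mathcal{L}(t) = \langle \Psi^L(t), (\partial_t - \hat{L})\, \Psi^R(t) \rangle \qquad (3.88)$$

can be taken as a Lagrangian functional in terms of which $\Gamma = \int_0^\infty dt\, \mathcal{L}(t)$, i.e. a time-density
for the effective action.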

On the basis of this theorem a practical Rayleigh-Ritz scheme may be devised. If the variation
described in the theorem is carried out within a finite-parameter ansatz such as Eqs.(2.50)
for $\Psi^H$, $H = L, R$, then the problem reduces to determining stationary points of a parametric
action

$$\Gamma[\alpha; h] = \int_0^\infty dt\, \left[ \pi_i(\alpha(t))\, \dot{\alpha}_i(t) - \mathcal{H}(\alpha(t)) - h(t)\cdot\left( \mathcal{Z}(\alpha(t)) - z(t) \right) + \lambda(t) \left( \mathcal{N}(\alpha(t)) - 1 \right) \right] \qquad (3.89)$$

which incorporates the constraints by Lagrange multipliers $h(t)$, $\lambda(t)$. We have defined

$$\mathcal{Z}(\alpha) = \langle \Psi^L(\alpha), \hat{Z}\, \Psi^R(\alpha) \rangle \qquad (3.90)$$

and

$$\mathcal{N}(\alpha) = \langle \Psi^L(\alpha), \Psi^R(\alpha) \rangle. \qquad (3.91)$$



As in the static case, the ansatz Eqs.(2.50) may need to be "augmented" to allow for the fact
that $\Psi^L(t) \not\equiv 1$ when $h(t) \ne 0$. We will consider here briefly just the simplest situation, where
$\tilde{\Psi}^H = \tilde{\Psi}^H(\tilde{\alpha}^H)$, $H = L, R$, with $\tilde{\Psi}^L$ given by Eq.(3.35) and the $\tilde{\alpha}^R$ parameters taken just to
be the corresponding moments $m$, as in Eqs.(3.66)-(3.69). In this case, the parametric action
takes the form

$$\Gamma[m, \alpha^L; h] = \int_0^\infty dt\, \left[ \alpha^L(t)\cdot\dot{m}(t) - \alpha^L(t)\cdot V(m(t), h(t)) + \lambda(t) \left( \alpha^L(t)\cdot m(t) - 1 \right) \right], \qquad (3.92)$$

neglecting some terms independent of the parameters being varied. The corresponding
Euler-Lagrange equations are

$$\dot{m}(t) = V(m(t), h(t)) - \lambda(t)\, m(t), \qquad (3.93)$$
$$\dot{\alpha}^L(t) + A(m(t), h(t))\, \alpha^L(t) = \lambda(t)\, \alpha^L(t), \qquad (3.94)$$
$$\alpha^L(t)\cdot m(t) = 1, \qquad (3.95)$$

with the boundary conditions at initial and final times:

$$m(0) = m_0, \qquad \alpha^L(+\infty) = (1, 0), \qquad \lambda(+\infty) = 0. \qquad (3.96)$$



These equations should be compared with their static counterparts, Eqs.( 3*1361 ), ( |3.69| ), For a 
specified $h(t)$, this two-point boundary value problem may be solved numerically by standard
methods: see [47], Ch. 17. For small $h(t)$, the best numerical scheme is probably the relaxation
method, because an exact solution is known for the system at $h(t) = 0$, corresponding to a
solution $m(t)$ of the moment-closure dynamics with specified initial data $m(0) = m_0$ and to
$\alpha^L(t) = (1,0)$, $\lambda(t) = 0$. This known solution for $h_0(t) = 0$ may then be input as an initial
guess into a relaxation algorithm to find the solution with some small $h_1(t)$, and, iteratively, a
sequence of solutions with $h_k(t)$ of increasing magnitude constructed. In this way, the fluctuations
around the predicted dynamical trajectory $m(t)$ of the moment closure may be explored
in the Rayleigh-Ritz method by varying $h(t)$. The method then yields an approximate effective
action

$$\Gamma_*[z] = \int_0^\infty dt\,\Big[\alpha_*^L(t)\cdot\dot m_*(t) - \alpha_*^L(t)\cdot V(m_*(t))\Big], \tag{3.97}$$

in which $m_*(t)$, $\alpha_*^L(t)$, $\lambda_*(t)$ are solutions of the initial-final value problem Eqs.(3.94)-(3.95),
with $h(t)$ selected so that

$$z(t) = \alpha_*^L(t)\cdot\mathcal{Z}(m_*(t)) \tag{3.98}$$

equals the specified $z(t)$. We have defined $\mathcal{Z}_i(m) = \langle Z\,\psi_i\rangle_m$.
Equivalently, the approximate action may be written as 

$$\Gamma_*[z] = \int_0^\infty dt\,\big[z(t)\cdot h(t) - \lambda_*(t)\big]. \tag{3.99}$$

This can be compared with the approximate effective potential in Proposition 1. If we define the
approximate generating functional $W_*[h] = \int_0^\infty dt\,\lambda_*(t)$, then it also follows as in Proposition
1 that

$$\frac{\delta W_*}{\delta h(t)}[h] = z(t). \tag{3.100}$$

Thus, the approximate effective action from the Rayleigh-Ritz method, Eq.(3.97) or Eq.(3.99),
retains the Legendre transform structure of the true effective action. It is not hard to derive
from this fact that

$$\Gamma_*[z_*] = 0 \quad \& \quad \frac{\delta\Gamma_*}{\delta z(t)}[z_*] = 0, \tag{3.101}$$

where $z_*(t) = \langle z\rangle_{m(t)}$ is the expected value of $z$ in the PDF ansatz calculated along the trajectory
$m(t)$ of the moment-closure. Hence, the predicted mean-history $z_*(t)$ is guaranteed to be
a stationary point of $\Gamma_*[z]$, but not necessarily a minimum point.
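
The continuation strategy described above for the boundary value problem (start from the exactly known solution at $h = 0$, then feed each converged solution in as the initial guess for a slightly larger $h$) can be illustrated on a stand-in problem. The following is a minimal sketch only: the toy equation $y'' + k\,e^{y} = 0$ with $y(0) = y(1) = 0$ (the Bratu problem) is an assumption chosen because its solution at $k = 0$ is known exactly, and it is not the closure system Eqs.(3.93)-(3.96). The parameter `k` plays the role of the amplitude of $h(t)$. SciPy's collocation solver `solve_bvp` is used here in place of the relaxation scheme of [47].

```python
import numpy as np
from scipy.integrate import solve_bvp


def rhs(x, y, k):
    # Toy first-order system y0' = y1, y1' = -k * exp(y0).
    # This is a stand-in, NOT the closure equations (3.93)-(3.95).
    return np.vstack((y[1], -k * np.exp(y[0])))


def bc(ya, yb):
    # Two-point conditions: one at each end of the interval.
    return np.array([ya[0], yb[0]])


def continuation(k_values, n_mesh=21):
    """Solve the toy problem for each k in turn, seeding each solve with the previous solution."""
    x = np.linspace(0.0, 1.0, n_mesh)
    y = np.zeros((2, x.size))  # exact solution at k = 0
    solutions = []
    for k in k_values:
        sol = solve_bvp(lambda x_, y_: rhs(x_, y_, k), bc, x, y)
        if not sol.success:
            raise RuntimeError(f"solver failed at k = {k}: {sol.message}")
        x, y = sol.x, sol.y  # reuse mesh and values as the next initial guess
        solutions.append(sol)
    return solutions


k_values = np.linspace(0.25, 1.0, 4)  # parameter values of increasing magnitude
solutions = continuation(k_values)
peaks = [float(s.sol(0.5)[0]) for s in solutions]  # midpoint value for each k
```

For the actual problem one would replace `rhs` and `bc` by Eqs.(3.93)-(3.96) on a truncated time interval, which requires a concrete closure $V(m,h)$ and is not attempted here.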
Recently, an alternative nonperturbative approximation to the nonequilibrium effective action
has been developed by Crisanti & Marconi [48], via a dynamical Hartree approximation.
While the two approximation schemes are similar in spirit, there are essential differences between
them. We present here no detailed comparison of the two techniques. However, we
believe it is a virtue of the present method that it allows an approximation of the effective
action and effective potential within any PDF ansatz that may be proposed. Furthermore,
it makes direct connection with the moment-closure equations which have been traditionally
used in nonequilibrium statistical dynamics. We believe that the combination of flexibility to
incorporate intuitive guesses and transparency of the physical interpretation should give the
present method far-reaching applications.
4 Appendix: General Variational Equations

The most general trial ansatz has the form $\Psi^H = \Psi^H(\alpha, \alpha^H)$, $H = L,R$ with $N = N_L + N_R$.
In this case, the parametric Hamiltonian is calculated as 

$$H(\alpha, \alpha^R, \alpha^L) = \langle \Psi^L(\alpha, \alpha^L), \hat{L}\,\Psi^R(\alpha, \alpha^R)\rangle. \tag{4.1}$$

Correspondingly, the fixed point conditions are 

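$$\frac{\partial H}{\partial\alpha_i^L}(\alpha, \alpha^R, \alpha^L) = \langle \psi_i^L(\alpha, \alpha^L), \hat{L}\,\Psi^R(\alpha, \alpha^R)\rangle = 0 \tag{4.2}$$

and

$$\frac{\partial H}{\partial\alpha_i^R}(\alpha, \alpha^R, \alpha^L) = \langle \Psi^L(\alpha, \alpha^L), \hat{L}\,\psi_i^R(\alpha, \alpha^R)\rangle = 0 \tag{4.3}$$

and

$$\frac{\partial H}{\partial\alpha_i}(\alpha, \alpha^R, \alpha^L) = \Big\langle \frac{\partial\Psi^L}{\partial\alpha_i}(\alpha, \alpha^L), \hat{L}\,\Psi^R(\alpha, \alpha^R)\Big\rangle + \Big\langle \Psi^L(\alpha, \alpha^L), \hat{L}\,\frac{\partial\Psi^R}{\partial\alpha_i}(\alpha, \alpha^R)\Big\rangle = 0 \tag{4.4}$$

with $\psi_i^H = \partial\Psi^H/\partial\alpha_i^H$, $H = R,L$. Within the same ansatz, the parametric evolution equations have
the form

$$\{\alpha_i, \alpha_j\}\,\dot\alpha_j + \{\alpha_i, \alpha_j^R\}\,\dot\alpha_j^R + \{\alpha_i, \alpha_j^L\}\,\dot\alpha_j^L = \frac{\partial H}{\partial\alpha_i}(\alpha, \alpha^R, \alpha^L), \tag{4.5}$$

and

$$\{\alpha_i^R, \alpha_j\}\,\dot\alpha_j + \{\alpha_i^R, \alpha_j^L\}\,\dot\alpha_j^L = \frac{\partial H}{\partial\alpha_i^R}(\alpha, \alpha^R, \alpha^L), \tag{4.6}$$

and

$$\{\alpha_i^L, \alpha_j\}\,\dot\alpha_j + \{\alpha_i^L, \alpha_j^R\}\,\dot\alpha_j^R = \frac{\partial H}{\partial\alpha_i^L}(\alpha, \alpha^R, \alpha^L). \tag{4.7}$$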
The most general ansatz of any obvious utility is that given in Eq.(2.33):

$$\Psi^L = 1 + \sum_{i=1}^N \alpha_i^L\,\psi_i^L(\alpha) \quad \& \quad \Psi^R = \Psi^R(\alpha). \tag{4.8}$$

This may be thought to correspond to the previous ansatz with $N_R = 0$, $N_L = N$ and with a
linear dependence of $\Psi^L$ on the $\alpha^L$. For this case, the parametric Hamiltonian is

$$H(\alpha, \alpha^L) = \sum_{i=1}^N \alpha_i^L\,V_i(\alpha) \tag{4.9}$$
with $V_i(\alpha) = \langle \hat{L}^\dagger\psi_i^L(\alpha)\rangle_\alpha$ the dynamical vector field in the parameter space, as in Eq.(2.78).
The fixed point conditions are simply

$$V_i(\alpha) = 0 \tag{4.10}$$

and

$$\sum_{j=1}^N \alpha_j^L\,\frac{\partial V_j}{\partial\alpha_i}(\alpha) = 0 \tag{4.11}$$

for $i = 1,...,N$. When the stability matrix at a fixed point $\alpha^*$ of the first equation (4.10) is
non-singular, $\det\left[\frac{\partial V}{\partial\alpha}(\alpha^*)\right] \neq 0$, then the only solution of the second equation is $\alpha^L = 0$. The
parametric evolution equations within the same ansatz are



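$$\{\alpha_i^L, \alpha_j\}\,\dot\alpha_j = V_i(\alpha) \tag{4.12}$$

and

$$\{\alpha_i, \alpha_j\}\,\dot\alpha_j + \{\alpha_i, \alpha_j^L\}\,\dot\alpha_j^L = \sum_{k=1}^N \alpha_k^L\,\frac{\partial V_k}{\partial\alpha_i}(\alpha), \tag{4.13}$$

where the Lagrange brackets are

$$\{\alpha_i, \alpha_j\} = \sum_{k=1}^N \alpha_k^L\left[\Big\langle \frac{\partial\psi_k^L}{\partial\alpha_i}(\alpha), \psi_j^R(\alpha)\Big\rangle - \Big\langle \frac{\partial\psi_k^L}{\partial\alpha_j}(\alpha), \psi_i^R(\alpha)\Big\rangle\right] \tag{4.14}$$

and

$$\{\alpha_i^L, \alpha_j\} = \langle \psi_i^L(\alpha), \psi_j^R(\alpha)\rangle \tag{4.15}$$

with now $\psi_j^R = \partial\Psi^R/\partial\alpha_j$. The second equation clearly has the constant solution $\alpha^L(t) = 0$. The
first equation then has the same form as Eq.(2.77) in the text. It is also identical with Eq.(2.11)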
in the work of Bayly pq] , but here derived by the variational method. 
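
The statement that a non-singular stability matrix forces $\alpha^L = 0$ in the second fixed point condition can be checked numerically on a small example. The sketch below is illustrative only: the two-parameter vector field `V` is a hypothetical stand-in (a damped Duffing-type field), not a closure derived in this paper, and Newton's method is simply one convenient way to locate a zero of $V$. The second condition (4.11) is the linear system $J^{T}\alpha^L = 0$ with $J_{ij} = \partial V_i/\partial\alpha_j$.

```python
import numpy as np


def V(a):
    # Hypothetical two-parameter vector field, for illustration only.
    return np.array([a[1], a[0] - a[0] ** 3 - 0.5 * a[1]])


def jacobian(a):
    # Stability matrix J[i, j] = dV_i / da_j for the toy field above.
    return np.array([[0.0, 1.0], [1.0 - 3.0 * a[0] ** 2, -0.5]])


def newton_fixed_point(a0, tol=1e-12, max_iter=50):
    """Find a zero of V, i.e. a solution of the first fixed point condition."""
    a = np.asarray(a0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(jacobian(a), -V(a))
        a = a + step
        if np.linalg.norm(step) < tol:
            return a
    raise RuntimeError("Newton iteration did not converge")


a_star = newton_fixed_point([0.8, 0.3])
J = jacobian(a_star)
det_J = np.linalg.det(J)

# Second condition: sum_j aL_j dV_j/da_i = 0, i.e. J^T aL = 0.
null_dim = J.shape[0] - np.linalg.matrix_rank(J)  # dimension of the solution space
aL = np.linalg.solve(J.T, np.zeros(2))            # unique solution when det_J != 0
```

When `det_J` vanishes, `np.linalg.solve` raises an error and nonzero $\alpha^L$ become possible; that degenerate case is not covered by this sketch.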